SOTAVerified

Benchmarking

Papers

Showing 1761–1770 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
Cable Tree Wiring -- Benchmarking Solvers on a Real-World Scheduling Problem with a Variety of Precedence Constraints	Nov 25, 2020	Benchmarking, Scheduling	Code Available	0	5
inMOTIFin: a lightweight end-to-end simulation software for regulatory sequences	Jun 25, 2025	Benchmarking	Code Available	0	5
Benchmarking ChatGPT-4 on ACR Radiation Oncology In-Training (TXIT) Exam and Red Journal Gray Zone Cases: Potentials and Challenges for AI-Assisted Medical Education and Decision Making in Radiation Oncology	Apr 24, 2023	Benchmarking, Decision Making	Code Available	0	5
M4Fog: A Global Multi-Regional, Multi-Modal, and Multi-Stage Dataset for Marine Fog Detection and Forecasting to Bridge Ocean and Atmosphere	Jun 19, 2024	Benchmarking, Spatio-Temporal Forecasting	Code Available	0	5
B-XAIC Dataset: Benchmarking Explainable AI for Graph Neural Networks Using Chemical Data	May 28, 2025	Benchmarking, Drug Discovery	Code Available	0	5
Benchmarking ChatGPT on Algorithmic Reasoning	Apr 4, 2024	Benchmarking	Code Available	0	5
Multitask learning and benchmarking with clinical time series data	Jun 17, 2019	Benchmarking, BIG-bench Machine Learning	Code Available	0	5
Building Conformal Prediction Intervals with Approximate Message Passing	Oct 21, 2024	Benchmarking, Conformal Prediction	Code Available	0	5
Building and benchmarking an Arabic Speech Commands dataset for small-footprint keyword spotting	May 7, 2021	Benchmarking, Deep Learning	Code Available	0	5
Adaptive Visual Scene Understanding: Incremental Scene Graph Generation	Oct 2, 2023	Benchmarking, Continual Learning	Code Available	0	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified