SOTAVerified

Benchmarking

Papers

Showing 1511–1520 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
Benchmarking Framework for Performance-Evaluation of Causal Inference Analysis	Feb 14, 2018	Benchmarking, Causal Inference	Code Available	0	5
Benchmarking framework for machine learning classification from fNIRS data	Mar 3, 2023	Benchmarking, Brain Computer Interface	Code Available	0	5
Benchmarking Foundation Models on Exceptional Cases: Dataset Creation and Validation	Oct 23, 2024	Articles, Benchmarking	Code Available	0	5
Knowledge Enhanced Conditional Imputation for Healthcare Time-series	Dec 27, 2023	Benchmarking, Imputation	Code Available	0	5
SCoRE: Benchmarking Long-Chain Reasoning in Commonsense Scenarios	Mar 8, 2025	Benchmarking, Diagnostic	Code Available	0	5
LABCAT: Locally adaptive Bayesian optimization using principal-component-aligned trust regions	Nov 19, 2023	Bayesian Optimization, Benchmarking	Code Available	0	5
A Position Paper on the Automatic Generation of Machine Learning Leaderboards	May 23, 2025	Benchmarking, Position	Code Available	0	5
ADVIO: An authentic dataset for visual-inertial odometry	Jul 25, 2018	Benchmarking	Code Available	0	5
Knowing-how & Knowing-that: A New Task for Machine Comprehension of User Manuals	Jun 7, 2023	Benchmarking, Machine Reading Comprehension	Code Available	0	5
ApisTox: a new benchmark dataset for the classification of small molecules toxicity on honey bees	Apr 24, 2024	Benchmarking, Molecular Property Prediction	Code Available	0	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified