Benchmarking

Papers

Showing 861–870 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
Introducing Milabench: Benchmarking Accelerators for AI	Nov 18, 2024	Benchmarking, Deep Learning	Code Available	1	5
Introducing the VoicePrivacy Initiative	May 4, 2020	Benchmarking	Code Available	1	5
GEOM-Drugs Revisited: Toward More Chemically Accurate Benchmarks for 3D Molecule Generation	Apr 30, 2025	3D Molecule Generation, Benchmarking	Code Available	1	5
CheX-GPT: Harnessing Large Language Models for Enhanced Chest X-ray Report Labeling	Jan 21, 2024	Benchmarking	Code Available	1	5
An Evaluation Dataset for Intent Classification and Out-of-Scope Prediction	Sep 4, 2019	Benchmarking, General Classification	Code Available	1	5
Benchmarking Batch Deep Reinforcement Learning Algorithms	Oct 3, 2019	Benchmarking, Deep Reinforcement Learning	Code Available	1	5
Benchmarking Multimodal Knowledge Conflict for Large Multimodal Models	May 26, 2025	Benchmarking, RAG	Code Available	1	5
EH-DNAS: End-to-End Hardware-aware Differentiable Neural Architecture Search	Nov 24, 2021	Benchmarking, Neural Architecture Search	Code Available	1	5
Emoji Prediction: Extensions and Benchmarking	Jul 14, 2020	Benchmarking, Multi-Label Classification	Code Available	1	5
Benchmarking Low-Shot Robustness to Natural Distribution Shifts	Apr 21, 2023	Benchmarking	Code Available	1	5

Page 87 of 555

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified