SOTAVerified

Benchmarking

Papers

Showing 1541–1550 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
Knowledge Enhanced Conditional Imputation for Healthcare Time-series	Dec 27, 2023	Benchmarking, Imputation	Code Available	0	5
SCoRE: Benchmarking Long-Chain Reasoning in Commonsense Scenarios	Mar 8, 2025	Benchmarking, Diagnostic	Code Available	0	5
Towards Enhancing Fault Tolerance in Neural Networks	Jul 6, 2019	Benchmarking	Code Available	0	5
KhabarChin: Automatic Detection of Important News in the Persian Language	Dec 6, 2023	Articles, Benchmarking	Code Available	0	5
AntiLeak-Bench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge	Dec 18, 2024	Benchmarking, World Knowledge	Code Available	0	5
Ants can orienteer a thief in their robbery	Apr 15, 2020	Benchmarking, Combinatorial Optimization	Code Available	0	5
Knowing-how & Knowing-that: A New Task for Machine Comprehension of User Manuals	Jun 7, 2023	Benchmarking, Machine Reading Comprehension	Code Available	0	5
Benchmarking Educational Program Repair	May 8, 2024	Benchmarking, Program Repair	Code Available	0	5
ANTHROPOS-V: benchmarking the novel task of Crowd Volume Estimation	Jan 3, 2025	Benchmarking, Crowd Counting	Code Available	0	5
Adversarial Environment Generation for Learning to Navigate the Web	Mar 2, 2021	Benchmarking, Decision Making	Code Available	0	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified