SOTAVerified

Benchmarking

Papers

Showing 4521–4530 of 5548 papers

Title	Date	Tasks	Status	Hype
The Collective Knowledge project: making ML models more portable and reproducible with open APIs, reusable best practices and MLOps	Jun 12, 2020	Benchmarking, object-detection	Code Available	0
a-DCF: an architecture agnostic metric with application to spoofing-robust speaker verification	Mar 3, 2024	Benchmarking, Speaker Verification	Code Available	0
Resource Interoperability for Sustainable Benchmarking: The Case of Events	May 1, 2018	Benchmarking	Code Available	0
Bayesian Neural Networks with Soft Evidence	Oct 19, 2020	Benchmarking	Code Available	0
BASED: Benchmarking, Analysis, and Structural Estimation of Deblurring	May 27, 2023	Benchmarking, Deblurring	Code Available	0
Bugs in the Data: How ImageNet Misrepresents Biodiversity	Aug 24, 2022	Benchmarking, Object Detection	Code Available	0
inMOTIFin: a lightweight end-to-end simulation software for regulatory sequences	Jun 25, 2025	Benchmarking	Code Available	0
LexSumm and LexT5: Benchmarking and Modeling Legal Summarization Tasks in English	Oct 12, 2024	Benchmarking	Code Available	0
InDL: A New Dataset and Benchmark for In-Diagram Logic Interpretation based on Visual Illusion	May 28, 2023	Benchmarking, Decision Making	Code Available	0
Individual Fairness Guarantees for Neural Networks	May 11, 2022	Benchmarking, Fairness	Code Available	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified