Benchmarking

Papers

Showing 31–40 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
Benchmarking Graphormer on Large-Scale Molecular Modeling Datasets	Mar 9, 2022	Benchmarking, Graph Regression	Code Available	4	5
OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning	Dec 31, 2024	Benchmarking, Logical Reasoning	Code Available	4	5
Molecular-driven Foundation Model for Oncologic Pathology	Jan 28, 2025	Benchmarking, Diagnostic	Code Available	4	5
Meta Audiobox Aesthetics: Unified Automatic Quality Assessment for Speech, Music, and Sound	Feb 7, 2025	Benchmarking	Code Available	4	5
MLPerf Power: Benchmarking the Energy Efficiency of Machine Learning Systems from Microwatts to Megawatts for Sustainable AI	Oct 15, 2024	Benchmarking	Code Available	4	5
MTEB: Massive Text Embedding Benchmark	Oct 13, 2022	Benchmarking, Information Retrieval	Code Available	4	5
Enabling more efficient and cost-effective AI/ML systems with Collective Mind, virtualized MLOps, MLPerf, Collective Knowledge Playground and reproducible optimization tournaments	Jun 24, 2024	Benchmarking	Code Available	4	5
Aequitas Flow: Streamlining Fair ML Experimentation	May 9, 2024	Benchmarking, Fairness	Code Available	4	5
Dora: Sampling and Benchmarking for 3D Shape Variational Auto-Encoders	Dec 23, 2024	3D Shape Modeling, Benchmarking	Code Available	4	5
I Think, Therefore I am: Benchmarking Awareness of Large Language Models Using AwareBench	Jan 31, 2024	Benchmarking, Multiple-choice	Code Available	4	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified