SOTAVerified

Benchmarking

Papers

Showing 871–880 of 5548 papers

Title	Date	Tasks	Status	Hype
Benchmarking Deep Graph Generative Models for Optimizing New Drug Molecules for COVID-19	Feb 9, 2021	Benchmarking, Q-Learning	Code Available	1
AI in Lung Health: Benchmarking Detection and Diagnostic Models Across Multiple CT Scan Datasets	May 7, 2024	Benchmarking, Cancer Classification	Code Available	1
CIPCaD-Bench: Continuous Industrial Process datasets for benchmarking Causal Discovery methods	Aug 2, 2022	Benchmarking, Causal Discovery	Code Available	1
Initial recommendations for performing, benchmarking, and reporting single-cell proteomics experiments	Jul 19, 2022	Benchmarking, Experimental Design	Code Available	1
In Search of Lost Online Test-time Adaptation: A Survey	Oct 31, 2023	Benchmarking, GPU	Code Available	1
Insights from Benchmarking Frontier Language Models on Web App Code Generation	Sep 8, 2024	Benchmarking, Code Generation	Code Available	1
A Survey of Pathology Foundation Model: Progress and Future Directions	Apr 5, 2025	Benchmarking, Multiple Instance Learning	Code Available	1
A Comprehensive Benchmark for RNA 3D Structure-Function Modeling	Mar 27, 2025	Benchmarking, Deep Learning	Code Available	1
GEOM-Drugs Revisited: Toward More Chemically Accurate Benchmarks for 3D Molecule Generation	Apr 30, 2025	3D Molecule Generation, Benchmarking	Code Available	1
Circumventing shortcuts in audio-visual deepfake detection datasets with unsupervised learning	Nov 29, 2024	Benchmarking, DeepFake Detection	Code Available	1

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified