Benchmarking

Papers

Showing 2451–2460 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
Generalization and Regularization in DQN	Sep 29, 2018	Atari Games, Benchmarking	Code Available	0	5
Distributing Deep Learning Hyperparameter Tuning for 3D Medical Image Segmentation	Oct 29, 2021	Benchmarking, Brain Tumor Segmentation	Code Available	0	5
A Framework for Generating Informative Benchmark Instances	May 29, 2022	Benchmarking	Code Available	0	5
Generative Models for Fast Simulation of Cherenkov Detectors at the Electron-Ion Collider	Apr 26, 2025	Benchmarking, GPU	Code Available	0	5
Flexible Generation of Preference Data for Recommendation Analysis	Jul 23, 2024	Benchmarking, Recommendation Systems	Code Available	0	5
A Classification Benchmark for Artificial Intelligence Detection of Laryngeal Cancer from Patient Voice	Dec 20, 2024	Benchmarking, Diagnostic	Code Available	0	5
Distributed Non-Convex Optimization with Sublinear Speedup under Intermittent Client Availability	Feb 18, 2020	Benchmarking, Federated Learning	Code Available	0	5
GenCeption: Evaluate Multimodal LLMs with Unlabeled Unimodal Data	Feb 22, 2024	Benchmarking	Code Available	0	5
Dissecting Sample Hardness: A Fine-Grained Analysis of Hardness Characterization Methods for Data-Centric AI	Mar 7, 2024	Benchmarking	Code Available	0	5
Dissecting Dissonance: Benchmarking Large Multimodal Models Against Self-Contradictory Instructions	Aug 2, 2024	Benchmarking, multimodal interaction	Code Available	0	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified