SOTAVerified

Benchmarking

Papers

Showing 2451-2460 of 5548 papers

Title | Status | Hype
Generalization and Regularization in DQN | Code | 0
Distributing Deep Learning Hyperparameter Tuning for 3D Medical Image Segmentation | Code | 0
A Framework for Generating Informative Benchmark Instances | Code | 0
Generative Models for Fast Simulation of Cherenkov Detectors at the Electron-Ion Collider | Code | 0
Flexible Generation of Preference Data for Recommendation Analysis | Code | 0
A Classification Benchmark for Artificial Intelligence Detection of Laryngeal Cancer from Patient Voice | Code | 0
Distributed Non-Convex Optimization with Sublinear Speedup under Intermittent Client Availability | Code | 0
GenCeption: Evaluate Multimodal LLMs with Unlabeled Unimodal Data | Code | 0
Dissecting Sample Hardness: A Fine-Grained Analysis of Hardness Characterization Methods for Data-Centric AI | Code | 0
Dissecting Dissonance: Benchmarking Large Multimodal Models Against Self-Contradictory Instructions | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified