SOTAVerified

Benchmarking

Papers

Showing 2091–2100 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| Data-driven Power Flow Linearization: Simulation | | 0 |
| Improving Generalization of Neural Vehicle Routing Problem Solvers Through the Lens of Model Architecture | Code | 0 |
| INTERSPEECH 2009 Emotion Challenge Revisited: Benchmarking 15 Years of Progress in Speech Emotion Recognition | Code | 0 |
| DISCOVERYWORLD: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents | Code | 3 |
| Can Language Models Serve as Text-Based World Simulators? | | 0 |
| Multivariate Stochastic Dominance via Optimal Transport and Applications to Models Benchmarking | | 0 |
| TopoBench: A Framework for Benchmarking Topological Deep Learning | Code | 3 |
| Smiles2Dock: an open large-scale multi-task dataset for ML-based molecular docking | Code | 1 |
| QGEval: Benchmarking Multi-dimensional Evaluation for Question Generation | Code | 1 |
| ICU-Sepsis: A Benchmark MDP Built from Real Medical Data | Code | 1 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |