SOTAVerified

Benchmarking

Papers

Showing 191-200 of 5548 papers

Title | Status | Hype
Event-Based Motion Magnification | Code | 2
FairMedFM: Fairness Benchmarking for Medical Imaging Foundation Models | Code | 2
EV2Gym: A Flexible V2G Simulator for EV Smart Charging Research and Benchmarking | Code | 2
Aria Digital Twin: A New Benchmark Dataset for Egocentric 3D Machine Perception | Code | 2
FluidLab: A Differentiable Environment for Benchmarking Complex Fluid Manipulation | Code | 2
FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models | Code | 2
A Toolkit for Reliable Benchmarking and Research in Multi-Objective Reinforcement Learning | Code | 2
Foundational Models Defining a New Era in Vision: A Survey and Outlook | Code | 2
EvalGIM: A Library for Evaluating Generative Image Models | Code | 2
Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing | Code | 2

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified