SOTAVerified

Benchmarking

Papers

Showing 4811–4820 of 5548 papers

Title | Status | Hype
Benchmarking Spatiotemporal Reasoning in LLMs and Reasoning Models: Capabilities and Challenges | Code | 0
Forecasting Future International Events: A Reliable Dataset for Text-Based Event Modeling | Code | 0
Benchmarking Single Image Dehazing and Beyond | Code | 0
VRKitchen2.0-IndoorKit: A Tutorial for Augmented Indoor Scene Building in Omniverse | Code | 0
One Law, Many Languages: Benchmarking Multilingual Legal Reasoning for Judicial Support | Code | 0
Forecasting Across Time Series Databases using Recurrent Neural Networks on Groups of Similar Series: A Clustering Approach | Code | 0
fMRI-S4: learning short- and long-range dynamic fMRI dependencies using 1D Convolutions and State Space Models | Code | 0
Scaling and Benchmarking Self-Supervised Visual Representation Learning | Code | 0
Scaling Compute Is Not All You Need for Adversarial Robustness | Code | 0
Scaling Up Resonate-and-Fire Networks for Fast Deep Learning | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified