SOTAVerified

Benchmarking

Papers

Showing 2481-2490 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| MANTA: A Large-Scale Multi-View and Visual-Text Anomaly Detection Dataset for Tiny Objects | | 0 |
| An Experimental Evaluation of Imputation Models for Spatial-Temporal Traffic Data | Code | 0 |
| Learning Hidden Physics and System Parameters with Deep Operator Networks | | 0 |
| ACT-Bench: Towards Action Controllable World Models for Autonomous Driving | | 0 |
| MozzaVID: Mozzarella Volumetric Image Dataset | | 0 |
| Benchmarking Open-ended Audio Dialogue Understanding for Large Audio-Language Models | | 0 |
| Benchmarking and Enhancing Surgical Phase Recognition Models for Robotic-Assisted Esophagectomy | | 0 |
| From Code to Play: Benchmarking Program Search for Games Using Large Language Models | | 0 |
| Magnetic Resonance Imaging Feature-Based Subtyping and Model Ensemble for Enhanced Brain Tumor Segmentation | Code | 0 |
| T2I-FactualBench: Benchmarking the Factuality of Text-to-Image Models with Knowledge-Intensive Concepts | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |