SOTAVerified

Benchmarking

Papers

Showing 911-920 of 5548 papers

Title | Status | Hype
AbsPyramid: Benchmarking the Abstraction Ability of Language Models with a Unified Entailment Graph | Code | 1
Benchmarking Data-driven Surrogate Simulators for Artificial Electromagnetic Materials | Code | 1
ClimART: A Benchmark Dataset for Emulating Atmospheric Radiative Transfer in Weather and Climate Models | Code | 1
A Comprehensive Benchmark for COVID-19 Predictive Modeling Using Electronic Health Records in Intensive Care | Code | 1
AIGV-Assessor: Benchmarking and Evaluating the Perceptual Quality of Text-to-Video Generation with LMM | Code | 1
ClearPose: Large-scale Transparent Object Dataset and Benchmark | Code | 1
Clinical Prompt Learning with Frozen Language Models | Code | 1
Learning with Noisy Labels Revisited: A Study Using Real-World Human Annotations | Code | 1
ShiftySpeech: A Large-Scale Synthetic Speech Dataset with Distribution Shifts | Code | 1
Large Scale MRI Collection and Segmentation of Cirrhotic Liver | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified