SOTAVerified

Benchmarking

Papers

Showing 701–710 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| Chaos as an interpretable benchmark for forecasting and data-driven modelling | Code | 1 |
| AllClear: A Comprehensive Dataset and Benchmark for Cloud Removal in Satellite Imagery | Code | 1 |
| CCTV-Gun: Benchmarking Handgun Detection in CCTV Images | Code | 1 |
| A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking | Code | 1 |
| Automatic sleep stage classification with deep residual networks in a mixed-cohort setting | Code | 1 |
| Cross-Modal Bidirectional Interaction Model for Referring Remote Sensing Image Segmentation | Code | 1 |
| CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer | Code | 1 |
| Towards Reliable Detection of LLM-Generated Texts: A Comprehensive Evaluation Framework with CUDRT | Code | 1 |
| CharacterBench: Benchmarking Character Customization of Large Language Models | Code | 1 |
| A Ladder of Causal Distances | Code | 1 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |