SOTAVerified

Benchmarking

Papers

Showing 701-710 of 5548 papers

Title | Status | Hype
DeID-GPT: Zero-shot Medical Text De-Identification by GPT-4 | Code | 1
AllClear: A Comprehensive Dataset and Benchmark for Cloud Removal in Satellite Imagery | Code | 1
Delving into Out-of-Distribution Detection with Medical Vision-Language Models | Code | 1
A Ladder of Causal Distances | Code | 1
ATOMMIC: An Advanced Toolbox for Multitask Medical Imaging Consistency to facilitate Artificial Intelligence applications from acquisition to analysis in Magnetic Resonance Imaging | Code | 1
Benchmarking Multidomain English-Indonesian Machine Translation | Code | 1
Dynatask: A Framework for Creating Dynamic AI Benchmark Tasks | Code | 1
Atom-Level Optical Chemical Structure Recognition with Limited Supervision | Code | 1
Benchmarking Large Language Models for News Summarization | Code | 1
RobFR: Benchmarking Adversarial Robustness on Face Recognition | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified