SOTAVerified

Benchmarking

Papers

Showing 1101-1110 of 5548 papers

Title | Status | Hype
A Comprehensive Benchmark for COVID-19 Predictive Modeling Using Electronic Health Records in Intensive Care | Code | 1
ScreenQA: Large-Scale Question-Answer Pairs over Mobile App Screenshots | Code | 1
Benchmarking Multimodal Variational Autoencoders: CdSprites+ Dataset and Toolkit | Code | 1
Structural Bias for Aspect Sentiment Triplet Extraction | Code | 1
nnOOD: A Framework for Benchmarking Self-supervised Anomaly Localisation Methods | Code | 1
Benchmarking Compositionality with Formal Languages | Code | 1
CIPCaD-Bench: Continuous Industrial Process datasets for benchmarking Causal Discovery methods | Code | 1
A Multifaceted Benchmarking of Synthetic Electronic Health Record Generation Models | Code | 1
Accelerated and interpretable oblique random survival forests | Code | 1
Tracking Every Thing in the Wild | Code | 1
Page 111 of 555

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified