SOTAVerified

Benchmarking

Papers

Showing 2811–2820 of 5548 papers

Title | Status | Hype
A Two-Step Framework for Multi-Material Decomposition of Dual Energy Computed Tomography from Projection Domain | | 0
Next-generation MRD assays: do we have the tools to evaluate them properly? | | 0
In Search of Lost Online Test-time Adaptation: A Survey | Code | 1
What's In My Big Data? | Code | 2
Theory of Mind in Large Language Models: Examining Performance of 11 State-of-the-Art models vs. Children Aged 7-10 on Advanced Tests | | 0
Battle of the Backbones: A Large-Scale Comparison of Pretrained Models across Computer Vision Tasks | Code | 2
Domain Generalization in Computational Pathology: Survey and Guidelines | | 0
A Metadata-Driven Approach to Understand Graph Neural Networks | | 0
Re-evaluating Retrosynthesis Algorithms with Syntheseus | Code | 1
LLMs and Finetuning: Benchmarking cross-domain performance for hate speech detection | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified