SOTAVerified

Benchmarking

Papers

Showing 2181–2190 of 5548 papers

Title | Status | Hype
Benchmarking Fish Dataset and Evaluation Metric in Keypoint Detection -- Towards Precise Fish Morphological Assessment in Aquaculture Breeding | Code | 1
CT-Eval: Benchmarking Chinese Text-to-Table Performance in Large Language Models | - | 0
EXACT: Towards a platform for empirically benchmarking Machine Learning model explanation methods | - | 0
Large-Scale Multi-Center CT and MRI Segmentation of Pancreas with Deep Learning | Code | 2
DispaRisk: Auditing Fairness Through Usable Information | Code | 0
MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering | Code | 2
EnviroExam: Benchmarking Environmental Science Knowledge of Large Language Models | - | 0
From Generalist to Specialist: Improving Large Language Models for Medical Physics Using ARCoT | - | 0
SMP Challenge: An Overview and Analysis of Social Media Prediction Challenge | - | 0
BraTS-Path Challenge: Assessing Heterogeneous Histopathologic Brain Tumor Sub-regions | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified