SOTAVerified

Benchmarking

Papers

Showing 1151-1160 of 5548 papers

Title | Status | Hype
An OpenMind for 3D medical vision self-supervised learning | Code | 2
First-frame Supervised Video Polyp Segmentation via Propagative and Semantic Dual-teacher Network | Code | 0
HammerBench: Fine-Grained Function-Calling Evaluation in Real Mobile Device Scenarios | Code | 0
Patherea: Cell Detection and Classification for the 2020s | - | 0
A Classification Benchmark for Artificial Intelligence Detection of Laryngeal Cancer from Patient Voice | Code | 0
Enriching Social Science Research via Survey Item Linking | Code | 0
Toward Robust Hyper-Detailed Image Captioning: A Multiagent Approach and Dual Evaluation Metrics for Factuality and Coverage | - | 0
Benchmarking LLMs and SLMs for patient reported outcomes | - | 0
AI-generated Image Quality Assessment in Visual Communication | Code | 0
Deciphering the Underserved: Benchmarking LLM OCR for Low-Resource Scripts | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified