SOTAVerified

Benchmarking

Papers

Showing 3231-3240 of 5548 papers

Title | Status | Hype
KRIS-Bench: Benchmarking Next-Level Intelligent Image Editing Models | | 0
KRISTEVA: Close Reading as a Novel Task for Benchmarking Interpretive Reasoning | | 0
K-Sort Arena: Efficient and Reliable Benchmarking for Generative Models via K-wise Human Preferences | | 0
Benchmarking Ophthalmology Foundation Models for Clinically Significant Age Macular Degeneration Detection | | 0
Benchmarking Open-Source Large Language Models on Healthcare Text Classification Tasks | | 0
L3Cube-MahaSBERT and HindSBERT: Sentence BERT Models and Benchmarking BERT Sentence Representations for Hindi and Marathi | | 0
L3 Fusion: Fast Transformed Convolutions on CPUs | | 0
Advocating Character Error Rate for Multilingual ASR Evaluation | | 0
Label Anchored Contrastive Learning for Language Understanding | | 0
Comparison of Open-Source and Proprietary LLMs for Machine Reading Comprehension: A Practical Analysis for Industrial Applications | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified