SOTAVerified

Benchmarking

Papers

Showing 4341–4350 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| WER We Stand: Benchmarking Urdu ASR Models | | 0 |
| What can 5.17 billion regression fits tell us about artificial models of the human visual system? | | 0 |
| What cleaves? Is proteasomal cleavage prediction reaching a ceiling? | | 0 |
| What Does Neuro Mean to Cardio? Investigating the Role of Clinical Specialty Data in Medical LLMs | | 0 |
| What Emotions Make One or Five Stars? Understanding Ratings of Online Product Reviews by Sentiment Analysis and XAI | | 0 |
| What if we had no Wikipedia? Domain-independent Term Extraction from a Large News Corpus | | 0 |
| Alexpaca: Learning Factual Clarification Question Generation Without Examples | | 0 |
| What Motivates You? Benchmarking Automatic Detection of Basic Needs from Short Posts | | 0 |
| Towards Self-adaptive Mutation in Evolutionary Multi-Objective Algorithms | | 0 |
| What Will it Take to Fix Benchmarking in Natural Language Understanding? | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |