SOTAVerified

Benchmarking

Papers

Showing 3021-3030 of 5548 papers

Title | Status | Hype
Speech Self-Supervised Representations Benchmarking: a Case for Larger Probing Heads | | 0
MLLM-DataEngine: An Iterative Refinement Approach for MLLM | Code | 1
Benchmarking Data Efficiency and Computational Efficiency of Temporal Action Localization Models | | 0
Beyond Document Page Classification: Design, Datasets, and Challenges | Code | 0
Topical-Chat: Towards Knowledge-Grounded Open-Domain Conversations | Code | 2
Benchmarking Causal Study to Interpret Large Language Models for Source Code | | 0
Finding the Perfect Fit: Applying Regression Models to ClimateBench v1.0 | Code | 0
LLMRec: Benchmarking Large Language Models on Recommendation Task | Code | 1
Efficient Benchmarking of Language Models | | 0
Expecting The Unexpected: Towards Broad Out-Of-Distribution Detection | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified