SOTAVerified

Benchmarking

Papers

Showing 3061-3070 of 5548 papers

Title | Status | Hype
A framework for benchmarking uncertainty in deep regression | - | 0
Individual Treatment Effect Estimation Through Controlled Neural Network Training in Two Stages | - | 0
The Pitfalls of Benchmarking in Algorithm Selection: What We Are Getting Wrong | - | 0
IndoLEM and IndoBERT: A Benchmark Dataset and Pre-trained Language Model for Indonesian NLP | - | 0
Benchmarking symbolic regression constant optimization schemes | - | 0
Benchmarking Surrogate-Assisted Genetic Recommender Systems | - | 0
Benchmarking Super-Resolution Algorithms on Real Data | - | 0
Influence-Optimistic Local Values for Multiagent Planning --- Extended Version | - | 0
InfoDeepSeek: Benchmarking Agentic Information Seeking for Retrieval-Augmented Generation | - | 0
Benchmarking Sub-Genre Classification For Mainstage Dance Music | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified