SOTAVerified

Benchmarking

Papers

Showing 4251–4260 of 5548 papers

Title | Status | Hype
Uncertainty Estimation with Deep Learning for Rainfall-Runoff Modelling | | 0
Understanding and Benchmarking Artificial Intelligence: OpenAI's o3 Is Not AGI | | 0
Understanding Foundation Models: Are We Back in 1924? | | 0
Understanding or Manipulation: Rethinking Online Performance Gains of Modern Recommender Systems | | 0
Understanding Recurrent Neural Architectures by Analyzing and Synthesizing Long Distance Dependencies in Benchmark Sequential Datasets | | 0
Understanding the Limits of Lifelong Knowledge Editing in LLMs | | 0
Understanding the RoPE Extensions of Long-Context LLMs: An Attention Perspective | | 0
Understanding the User: An Intent-Based Ranking Dataset | | 0
Uniform Discretized Integrated Gradients: An effective attribution based method for explaining large language models | | 0
Unifying Few- and Zero-Shot Egocentric Action Recognition | | 0
Page 426 of 555

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified