SOTAVerified

Benchmarking

Papers

Showing 1941-1950 of 5548 papers

Title | Status | Hype
AntiLeak-Bench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge | Code | 0
CURATe: Benchmarking Personalised Alignment of Conversational AI Assistants | Code | 0
A Modular Workflow for Performance Benchmarking of Neuronal Network Simulations | Code | 0
Beyond Marginal Uncertainty: How Accurately can Bayesian Regression Models Estimate Posterior Predictive Correlations? | Code | 0
Illuminating the Diversity-Fitness Trade-Off in Black-Box Optimization | Code | 0
Illusory VQA: Benchmarking and Enhancing Multimodal Models on Visual Illusions | Code | 0
Beyond Document Page Classification: Design, Datasets, and Challenges | Code | 0
A Modular Benchmarking Infrastructure for High-Performance and Reproducible Deep Learning | Code | 0
IJCB 2022 Mobile Behavioral Biometrics Competition (MobileB2C) | Code | 0
BASED: Benchmarking, Analysis, and Structural Estimation of Deblurring | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified