SOTAVerified

Benchmarking

Papers

Showing 2951–2960 of 5548 papers

Title | Status | Hype
Demographic Parity: Mitigating Biases in Real-World Data |  | 0
NLPBench: Evaluating Large Language Models on Solving NLP Problems | Code | 1
A Content-Driven Micro-Video Recommendation Dataset at Scale | Code | 2
Unified Long-Term Time-Series Forecasting Benchmark | Code | 1
Node-Aligned Graph-to-Graph (NAG2G): Elevating Template-Free Deep Learning Approaches in Single-Step Retrosynthesis | Code | 1
Advancing The Rate-Distortion-Computation Frontier For Neural Image Compression |  | 0
A Toolkit for Reliable Benchmarking and Research in Multi-Objective Reinforcement Learning | Code | 2
Thalamic nuclei segmentation from T1-weighted MRI: unifying and benchmarking state-of-the-art methods with young and old cohorts |  | 0
On quantifying and improving realism of images generated with diffusion |  | 0
Optimization Techniques for a Physical Model of Human Vocalisation |  | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 |  | Unverified