SOTAVerified

Benchmarking

Papers

Showing 1471–1480 of 5548 papers

Title | Status | Hype
Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation | Code | 1
Benchmarking Graph Neural Networks on Dynamic Link Prediction | Code | 1
Benchmarking Graph Neural Networks for FMRI analysis | Code | 1
Overcoming Common Flaws in the Evaluation of Selective Classification Systems | Code | 1
Beyond Normal: On the Evaluation of Mutual Information Estimators | Code | 1
Can 3D Vision-Language Models Truly Understand Natural Language? | Code | 1
CosPGD: an efficient white-box adversarial attack for pixel-wise prediction tasks | Code | 1
CriticBench: Benchmarking LLMs for Critique-Correct Reasoning | Code | 1
Active-Passive SimStereo -- Benchmarking the Cross-Generalization Capabilities of Deep Learning-based Stereo Methods | Code | 1
Data-Driven Denoising of Stationary Accelerometer Signals | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified