SOTAVerified

Benchmarking

Papers

Showing 3531-3540 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| An efficiency analysis of Spanish airports | | 0 |
| A Comprehensive Summarization and Evaluation of Feature Refinement Modules for CTR Prediction | Code | 0 |
| DeepPatent2: A Large-Scale Benchmarking Corpus for Technical Drawing Understanding | Code | 0 |
| Benchmarking Deep Facial Expression Recognition: An Extensive Protocol with Balanced Dataset in the Wild | | 0 |
| Benchmarking Differential Evolution on a Quantum Simulator | | 0 |
| Exploitation-Guided Exploration for Semantic Embodied Navigation | | 0 |
| Benchmarking a Benchmark: How Reliable is MS-COCO? | | 0 |
| Learning Disentangled Speech Representations | | 0 |
| Multi-EuP: The Multilingual European Parliament Dataset for Analysis of Bias in Information Retrieval | Code | 0 |
| Grounded Intuition of GPT-Vision's Abilities with Scientific Images | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |