SOTAVerified

Benchmarking

Papers

Showing 2971-2980 of 5548 papers

Title | Status | Hype
Benchmarking the Robustness of Quantized Models |  | 0
Vulnerability of Face Morphing Attacks: A Case Study on Lookalike and Identical Twins |  | 0
Ice Cream Doesn't Cause Drowning: Benchmarking LLMs Against Statistical Pitfalls in Causal Inference |  | 0
ICE-ID: A Novel Historical Census Data Benchmark Comparing NARS against LLMs, & a ML Ensemble on Longitudinal Identity Resolution |  | 0
ICON^2: Reliably Benchmarking Predictive Inequity in Object Detection |  | 0
Benchmarking the Robustness of Panoptic Segmentation for Automated Driving |  | 0
The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs |  | 0
Identifiable Convex-Concave Regression via Sub-gradient Regularised Least Squares |  | 0
Identification of vortex in unstructured mesh with graph neural networks |  | 0
The Leaderboard Illusion |  | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 |  | Unverified