SOTAVerified

Benchmarking

Papers

Showing 1031 to 1040 of 5548 papers

Title | Status | Hype
CODEBench: A Neural Architecture and Hardware Accelerator Co-Design Framework | Code | 1
Coarse-to-Fine Q-attention with Learned Path Ranking | Code | 1
AutoAdvExBench: Benchmarking autonomous exploitation of adversarial example defenses | Code | 1
CO-Bench: Benchmarking Language Model Agents in Algorithm Search for Combinatorial Optimization | Code | 1
CoDEx: A Comprehensive Knowledge Graph Completion Benchmark | Code | 1
A skeletonization algorithm for gradient-based optimization | Code | 1
Benchmarking Visual Localization for Autonomous Navigation | Code | 1
Clinical Prompt Learning with Frozen Language Models | Code | 1
ClimART: A Benchmark Dataset for Emulating Atmospheric Radiative Transfer in Weather and Climate Models | Code | 1
A GPU-accelerated Large-scale Simulator for Transportation System Optimization Benchmarking | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified