SOTAVerified

Benchmarking

Papers

Showing 1071–1080 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| Benchmarking Vision Foundation Models for Input Monitoring in Autonomous Driving | | 0 |
| Benchmarking Graph Representations and Graph Neural Networks for Multivariate Time Series Classification | Code | 0 |
| Benchmarking Abstractive Summarisation: A Dataset of Human-authored Summaries of Norwegian News Articles | | 0 |
| Stronger Than You Think: Benchmarking Weak Supervision on Realistic Tasks | Code | 0 |
| The Paradox of Success in Evolutionary and Bioinspired Optimization: Revisiting Critical Issues, Key Studies, and Methodological Pathways | | 0 |
| Lessons From Red Teaming 100 Generative AI Products | | 0 |
| TimberVision: A Multi-Task Dataset and Framework for Log-Component Segmentation and Tracking in Autonomous Forestry Operations | Code | 1 |
| Understanding and Benchmarking Artificial Intelligence: OpenAI's o3 Is Not AGI | | 0 |
| WebWalker: Benchmarking LLMs in Web Traversal | Code | 11 |
| Benchmarking YOLOv8 for Optimal Crack Detection in Civil Infrastructure | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |