Benchmarking

Papers

Showing 2761–2770 of 5548 papers

Title	Date	Tasks	Status	Hype
AI Idea Bench 2025: AI Research Idea Generation Benchmark	Apr 19, 2025	Benchmarking, scientific discovery	Unverified	0
Exploring the Adversarial Frontier: Quantifying Robustness via Adversarial Hypervolume	Mar 8, 2024	Adversarial Robustness, Benchmarking	Unverified	0
ImageNet performance correlates with pose estimation robustness and generalization on out-of-domain data	Jul 17, 2020	Animal Pose Estimation, Benchmarking	Unverified	0
Improved YOLOv12 with LLM-Generated Synthetic Data for Enhanced Apple Detection and Benchmarking Against YOLOv11 and YOLOv10	Feb 26, 2025	Benchmarking, object-detection	Unverified	0
A Survey of Model Compression and Acceleration for Deep Neural Networks	Oct 23, 2017	Benchmarking, Knowledge Distillation	Unverified	0
Geometric feature performance under downsampling for EEG classification tasks	Feb 15, 2021	Benchmarking, Classification	Unverified	0
Benchmarking Poisoning Attacks against Retrieval-Augmented Generation	May 24, 2025	Benchmarking, Question Answering	Unverified	0
Geometry Matters: Benchmarking Scientific ML Approaches for Flow Prediction around Complex Geometries	Dec 31, 2024	Benchmarking, Out-of-Distribution Generalization	Unverified	0
Image2Struct: Benchmarking Structure Extraction for Vision-Language Models	Oct 29, 2024	Benchmarking	Unverified	0
Exploring Continual Learning of Diffusion Models	Mar 27, 2023	Benchmarking, Continual Learning	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified