Benchmarking

Papers

Showing 3581–3590 of 5548 papers

Title	Date	Tasks	Status	Hype
On the Use of Quality Diversity Algorithms for The Traveling Thief Problem	Dec 16, 2021	Benchmarking, Diversity	Unverified	0
On the Utility of Equivariance and Symmetry Breaking in Deep Learning Architectures on Point Clouds	Jan 1, 2025	Benchmarking	Unverified	0
On the Value of ML Models	Dec 13, 2021	Benchmarking	Unverified	0
OOD-CV-v2: An extended Benchmark for Robustness to Out-of-Distribution Shifts of Individual Nuisances in Natural Images	Apr 17, 2023	3D Pose Estimation, Benchmarking	Unverified	0
OODFace: Benchmarking Robustness of Face Recognition under Common Corruptions and Appearance Variations	Dec 3, 2024	Benchmarking, Face Recognition	Unverified	0
OOD-Speech: A Large Bengali Speech Recognition Dataset for Out-of-Distribution Benchmarking	May 15, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
Open-CD: A Comprehensive Toolbox for Change Detection	Jul 22, 2024	Benchmarking, Change Detection	Unverified	0
OpenContrails: Benchmarking Contrail Detection on GOES-16 ABI	Apr 4, 2023	Benchmarking	Unverified	0
Open Datasets for Satellite Radio Resource Control	Apr 22, 2024	Benchmarking, Decision Making	Unverified	0
OpenDeception: Benchmarking and Investigating AI Deceptive Behaviors via Open-ended Interaction Simulation	Apr 18, 2025	Benchmarking	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified