SOTAVerified

Benchmarking

Papers

Showing 1441–1450 of 5548 papers

Title	Date	Tasks	Status	Hype
NTIRE 2020 Challenge on Real-World Image Super-Resolution: Methods and Results	May 5, 2020	Benchmarking, Image Super-Resolution	Code Available	1
NuCLS: A scalable crowdsourcing, deep learning approach and dataset for nucleus classification, localization and segmentation	Feb 18, 2021	Benchmarking, Interpretable Machine Learning	Code Available	1
AQuA: A Benchmarking Tool for Label Quality Assessment	Jun 15, 2023	Benchmarking, Label Error Detection	Code Available	1
Object Shape Error Response Using Bayesian 3-D Convolutional Neural Networks for Assembly Systems With Compliant Parts	Dec 8, 2021	3D Shape Modeling, Benchmarking	Code Available	1
CosPGD: an efficient white-box adversarial attack for pixel-wise prediction tasks	Feb 4, 2023	Adversarial Attack, Adversarial Robustness	Code Available	1
Benchpress: A Scalable and Versatile Workflow for Benchmarking Structure Learning Algorithms	Jul 8, 2021	Benchmarking	Code Available	1
APTv2: Benchmarking Animal Pose Estimation and Tracking with a Large-scale Dataset and Beyond	Dec 25, 2023	Animal Pose Estimation, Benchmarking	Code Available	1
CHOICE: Benchmarking the Remote Sensing Capabilities of Large Vision-Language Models	Nov 27, 2024	Benchmarking, Earth Observation	Code Available	1
CounselBench: A Large-Scale Expert Evaluation and Adversarial Benchmark of Large Language Models in Mental Health Counseling	Jun 10, 2025	Benchmarking	Code Available	1
Contemporary Symbolic Regression Methods and their Relative Performance	Jul 29, 2021	Benchmarking, parameter estimation	Code Available	1

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified