SOTAVerified

Benchmarking

Papers

Showing 1991–2000 of 5548 papers

Title	Date	Tasks	Status	Hype
How far are today's time-series models from real-world weather forecasting applications?	Jun 20, 2024	Benchmarking, Time Series	Code Available	2
The Elusive Pursuit of Reproducing PATE-GAN: Benchmarking, Auditing, Debugging	Jun 20, 2024	Benchmarking	Code Available	0
Benchmarking Monocular 3D Dog Pose Estimation Using In-The-Wild Motion Capture Data	Jun 20, 2024	Animal Pose Estimation, Benchmarking	Unverified	0
African or European Swallow? Benchmarking Large Vision-Language Models for Fine-Grained Object Classification	Jun 20, 2024	Benchmarking, Classification	Code Available	1
HoTPP Benchmark: Are We Good at the Long Horizon Events Forecasting?	Jun 20, 2024	Benchmarking, Point Processes	Code Available	2
Resource-efficient Medical Image Analysis with Self-adapting Forward-Forward Networks	Jun 20, 2024	Benchmarking, Medical Image Analysis	Unverified	0
DASB -- Discrete Audio and Speech Benchmark	Jun 20, 2024	Benchmarking, Emotion Recognition	Unverified	0
A Benchmarking Study of Kolmogorov-Arnold Networks on Tabular Data	Jun 20, 2024	Benchmarking, Kolmogorov-Arnold Networks	Code Available	1
FairX: A comprehensive benchmarking tool for model analysis using fairness, utility, and explainability	Jun 20, 2024	Benchmarking, Fairness	Code Available	0
PoseBench: Benchmarking the Robustness of Pose Estimation Models under Corruptions	Jun 20, 2024	Animal Pose Estimation, Autonomous Driving	Unverified	0

Page 200 of 555

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified