Benchmarking

Papers

Showing 2021–2030 of 5548 papers

Title	Date	Tasks	Status	Hype
DeepFake Doctor: Diagnosing and Treating Audio-Video Fake Detection	Jun 6, 2025	Benchmarking, DeepFake Detection	Unverified	0
Evaluation of Human-AI Teams for Learned and Rule-Based Agents in Hanabi	Jul 15, 2021	Benchmarking, Deep Reinforcement Learning	Unverified	0
CayleyPy RL: Pathfinding and Reinforcement Learning on Cayley Graphs	Feb 25, 2025	Benchmarking, reinforcement-learning	Unverified	0
Benchmarking and Evaluation of AI Models in Biology: Outcomes and Recommendations from the CZI Virtual Cells Workshop	Jul 14, 2025	Benchmarking	Unverified	0
Deep Generative Models for Physiological Signals: A Systematic Literature Review	Jul 12, 2023	Benchmarking, EEG	Unverified	0
Deep Hedging of Long-Term Financial Derivatives	Jul 29, 2020	Benchmarking, Deep Reinforcement Learning	Unverified	0
An EEG-based Stereoscopic Research to Reveal the Brain's Response to What Happens Before and After Watching 2D and 3D Movies	Mar 13, 2019	Benchmarking, EEG	Unverified	0
Deep Imputation of Missing Values in Time Series Health Data: A Review with Benchmarking	Feb 10, 2023	Benchmarking, Deep Learning	Unverified	0
CausalRivers -- Scaling up benchmarking of causal discovery for real-world time-series	Mar 21, 2025	Anomaly Detection, Benchmarking	Unverified	0
Benchmarking and Error Diagnosis in Multi-Instance Pose Estimation	Jul 17, 2017	Benchmarking, Pose Estimation	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified