Benchmarking

Papers

Showing 1391–1400 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
Benchmarking Image Retrieval for Visual Localization	Nov 24, 2020	Autonomous Driving, Benchmarking	Code Available	1	5
ArabicaQA: A Comprehensive Dataset for Arabic Question Answering	Mar 26, 2024	Benchmarking, Machine Reading Comprehension	Code Available	1	5
Benchmarking human visual search computational models in natural scenes: models comparison and reference datasets	Dec 10, 2021	Benchmarking	Code Available	1	5
Interpretable statistical representations of neural population dynamics and geometry	Apr 6, 2023	Benchmarking, Decision Making	Code Available	1	5
InstructTTSEval: Benchmarking Complex Natural-Language Instruction Following in Text-to-Speech Systems	Jun 19, 2025	Benchmarking, Descriptive	Code Available	1	5
Dynatask: A Framework for Creating Dynamic AI Benchmark Tasks	Apr 5, 2022	Benchmarking	Code Available	1	5
Physiology-based simulation of the retinal vasculature enables annotation-free segmentation of OCT angiographs	Jul 22, 2022	Benchmarking, Retinal Vessel Segmentation	Code Available	1	5
PIC4rl-gym: a ROS2 modular framework for Robots Autonomous Navigation with Deep Reinforcement Learning	Nov 19, 2022	Autonomous Navigation, Benchmarking	Code Available	1	5
Aquatic Navigation: A Challenging Benchmark for Deep Reinforcement Learning	May 30, 2024	Autonomous Driving, Benchmarking	Code Available	1	5
IntelliGraphs: Datasets for Benchmarking Knowledge Graph Generation	Jul 13, 2023	Benchmarking, Graph Embedding	Code Available	1	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified