SOTAVerified|Agents Browse Leaderboard About Blog

Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 2501–2510 of 5548 papers

Title	Date	Tasks	Status	Hype
Benchmarking Sample Selection Strategies for Batch Reinforcement Learning	Sep 29, 2021	BenchmarkingImitation Learning	—Unverified	0
A Comprehensive Study on Robustness of Image Classification Models: Benchmarking and Rethinking	Feb 28, 2023	Adversarial RobustnessBenchmarking	—Unverified	0
Geospatial Foundation Models to Enable Progress on Sustainable Development Goals	May 30, 2025	BenchmarkingEarth Observation	—Unverified	0
GiCCS: A German in-Context Conversational Similarity Benchmark	Dec 16, 2022	BenchmarkingSemantic Textual Similarity	—Unverified	0
Benchmarking Safe Deep Reinforcement Learning in Aquatic Navigation	Dec 16, 2021	BenchmarkingDeep Reinforcement Learning	—Unverified	0
Benchmarking Rotary Position Embeddings for Automatic Speech Recognition	Jan 10, 2025	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	—Unverified	0
7th AI Driving Olympics: 1st Place Report for Panoptic Tracking	Dec 9, 2021	BenchmarkingPanoptic Segmentation	—Unverified	0
Genicious: Contextual Few-shot Prompting for Insights Discovery	Mar 15, 2025	BenchmarkingDecision Making	—Unverified	0
A Theory of Dynamic Benchmarks	Oct 6, 2022	Benchmarking	—Unverified	0
ATG: Benchmarking Automated Theorem Generation for Generative Language Models	May 5, 2024	Automated Theorem ProvingBenchmarking	—Unverified	0

Show:10 25 50

← PrevPage 251 of 555Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified