Benchmarking

Papers

Showing 2251–2260 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
Dynamic Neighborhood Construction for Structured Large Discrete Action Spaces	May 31, 2023	Benchmarking, Recommendation Systems	Code Available	0	5
Harmonization Benchmarking Tool for Neuroimaging Datasets	Nov 15, 2022	Benchmarking, Diffusion MRI	Code Available	0	5
Benchmarking Multimodal RAG through a Chart-based Document Question-Answering Generation Framework	Feb 20, 2025	Benchmarking, Question Answering	Code Available	0	5
Guidelines and Benchmarks for Deployment of Deep Learning Models on Smartphones as Real-Time Apps	Jan 8, 2019	Benchmarking, CPU	Code Available	0	5
Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program	Apr 9, 2025	Benchmarking	Code Available	0	5
A Seq2Seq approach to Symbolic Regression	Oct 17, 2020	Benchmarking, regression	Code Available	0	5
Grounding Synthetic Data Evaluations of Language Models in Unsupervised Document Corpora	May 13, 2025	Benchmarking, Diagnostic	Code Available	0	5
gym-gazebo2, a toolkit for reinforcement learning using ROS 2 and Gazebo	Mar 14, 2019	Benchmarking, OpenAI Gym	Code Available	0	5
Harnessing Orthogonality to Train Low-Rank Neural Networks	Jan 16, 2024	Benchmarking	Code Available	0	5
Benchmarking Multilabel Topic Classification in the Kyrgyz Language	Aug 30, 2023	Benchmarking, Classification	Code Available	0	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified