SOTAVerified|Agents Browse Leaderboard About

Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1451–1460 of 5548 papers

Title	Date	Tasks	Status	Hype
Towards Motion Forecasting with Real-World Perception Inputs: Are End-to-End Approaches Competitive?	Jun 15, 2023	Autonomous DrivingAutonomous Vehicles	CodeCode Available	1
Online Learning with Optimism and Delay	Jun 13, 2021	BenchmarkingWeather Forecasting	CodeCode Available	1
D2S: Document-to-Slide Generation Via Query-Based Text Summarization	May 8, 2021	BenchmarkingLong Form Question Answering	CodeCode Available	1
CounselBench: A Large-Scale Expert Evaluation and Adversarial Benchmark of Large Language Models in Mental Health Counseling	Jun 10, 2025	Benchmarking	CodeCode Available	1
CosPGD: an efficient white-box adversarial attack for pixel-wise prediction tasks	Feb 4, 2023	Adversarial AttackAdversarial Robustness	CodeCode Available	1
Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation	Dec 26, 2019	BenchmarkingDomain Adaptation	CodeCode Available	1
CHOICE: Benchmarking the Remote Sensing Capabilities of Large Vision-Language Models	Nov 27, 2024	BenchmarkingEarth Observation	CodeCode Available	1
OpenABC-D: A Large-Scale Dataset For Machine Learning Guided Integrated Circuit Synthesis	Oct 21, 2021	BenchmarkingBIG-bench Machine Learning	CodeCode Available	1
CovDocker: Benchmarking Covalent Drug Design with Tasks, Datasets, and Solutions	Jun 26, 2025	BenchmarkingDrug Design	CodeCode Available	1
Benchmarking Graph Neural Networks on Dynamic Link Prediction	Sep 29, 2021	BenchmarkingDynamic Link Prediction	CodeCode Available	1

Show:10 25 50

← PrevPage 146 of 555Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified