SOTAVerified|Agents Browse Leaderboard About Blog

Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1481–1490 of 5548 papers

Title	Date	Tasks	Status	Hype
Pedestrian Trajectory Prediction with Missing Data: Datasets, Imputation, and Benchmarking	Oct 31, 2024	BenchmarkingImputation	CodeCode Available	1
PepMLM: Target Sequence-Conditioned Generation of Therapeutic Peptide Binders via Span Masked Language Modeling	Oct 5, 2023	BenchmarkingLanguage Modeling	CodeCode Available	1
BiCo-Net: Regress Globally, Match Locally for Robust 6D Pose Estimation	May 7, 2022	6D Pose EstimationBenchmarking	CodeCode Available	1
CovDocker: Benchmarking Covalent Drug Design with Tasks, Datasets, and Solutions	Jun 26, 2025	BenchmarkingDrug Design	CodeCode Available	1
CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer	Dec 2, 2021	BenchmarkingOrdinal Classification	CodeCode Available	1
Beyond Normal: On the Evaluation of Mutual Information Estimators	Jun 19, 2023	BenchmarkingDomain Generalization	CodeCode Available	1
DCL-Net: Deep Correspondence Learning Network for 6D Pose Estimation	Oct 11, 2022	6D Pose Estimation6D Pose Estimation using RGB	CodeCode Available	1
DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios	Oct 31, 2024	BenchmarkingLLM-generated Text Detection	CodeCode Available	1
FinanceReasoning: Benchmarking Financial Numerical Reasoning More Credible, Comprehensive and Challenging	Jun 6, 2025	Benchmarking	CodeCode Available	1
KoLA: Carefully Benchmarking World Knowledge of Large Language Models	Jun 15, 2023	BenchmarkingHallucination	CodeCode Available	1

Show:10 25 50

← PrevPage 149 of 555Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified