SOTAVerified

Benchmarking

Papers

Showing 1481–1490 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| Pedestrian Trajectory Prediction with Missing Data: Datasets, Imputation, and Benchmarking | Code | 1 |
| PepMLM: Target Sequence-Conditioned Generation of Therapeutic Peptide Binders via Span Masked Language Modeling | Code | 1 |
| BiCo-Net: Regress Globally, Match Locally for Robust 6D Pose Estimation | Code | 1 |
| CovDocker: Benchmarking Covalent Drug Design with Tasks, Datasets, and Solutions | Code | 1 |
| CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer | Code | 1 |
| Beyond Normal: On the Evaluation of Mutual Information Estimators | Code | 1 |
| DCL-Net: Deep Correspondence Learning Network for 6D Pose Estimation | Code | 1 |
| DetectRL: Benchmarking LLM-Generated Text Detection in Real-World Scenarios | Code | 1 |
| FinanceReasoning: Benchmarking Financial Numerical Reasoning More Credible, Comprehensive and Challenging | Code | 1 |
| KoLA: Carefully Benchmarking World Knowledge of Large Language Models | Code | 1 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |