Benchmarking

Papers

Showing 4501–4510 of 5548 papers

Title	Date	Tasks	Status	Hype
Beyond MD17: the reactive xxMD dataset	Aug 22, 2023	Benchmarking, Computational chemistry	Code Available	0
The biglasso Package: A Memory- and Computation-Efficient Solver for Lasso Model Fitting with Big Data in R	Jan 20, 2017	Benchmarking	Code Available	0
Learning to Transfer for Traffic Forecasting via Multi-task Learning	Nov 27, 2021	Benchmarking, Domain Adaptation	Code Available	0
IOLBENCH: Benchmarking LLMs on Linguistic Reasoning	Jan 8, 2025	Benchmarking	Code Available	0
InViG: Benchmarking Interactive Visual Grounding with 500K Human-Robot Interactions	Oct 18, 2023	Benchmarking, Visual Grounding	Code Available	0
Investigating the Impact of Hard Samples on Accuracy Reveals In-class Data Imbalance	Sep 22, 2024	AutoML, Benchmarking	Code Available	0
BEARD: Benchmarking the Adversarial Robustness for Dataset Distillation	Nov 14, 2024	Adversarial Attack, Adversarial Robustness	Code Available	0
RerrFact: Reduced Evidence Retrieval Representations for Scientific Claim Verification	Feb 5, 2022	Benchmarking, Binary Classification	Code Available	0
Inverse Contextual Bandits: Learning How Behavior Evolves over Time	Jul 13, 2021	Benchmarking, Decision Making	Code Available	0
UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models	Oct 17, 2024	Benchmarking	Code Available	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified