SOTAVerified|Agents Browse Leaderboard About

Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1241–1250 of 5548 papers

Title	Date	Tasks	Status	Hype
A Comprehensive Benchmark for COVID-19 Predictive Modeling Using Electronic Health Records in Intensive Care	Sep 16, 2022	BenchmarkingDeep Learning	CodeCode Available	1
GENEVA: Benchmarking Generalizability for Event Argument Extraction with Hundreds of Event Types and Argument Roles	May 25, 2022	BenchmarkingEvent Argument Extraction	CodeCode Available	1
CommonPower: A Framework for Safe Data-Driven Smart Grid Control	Jun 5, 2024	Benchmarkingenergy management	CodeCode Available	1
Benchmarking Language Model Creativity: A Case Study on Code Generation	Jul 12, 2024	BenchmarkingCode Generation	CodeCode Available	1
CompanyKG: A Large-Scale Heterogeneous Graph for Company Similarity Quantification	Jun 18, 2023	BenchmarkingRetrieval	CodeCode Available	1
CombiBench: Benchmarking LLM Capability for Combinatorial Mathematics	May 6, 2025	Benchmarking	CodeCode Available	1
A Comprehensive Benchmark for RNA 3D Structure-Function Modeling	Mar 27, 2025	BenchmarkingDeep Learning	CodeCode Available	1
GEOM-Drugs Revisited: Toward More Chemically Accurate Benchmarks for 3D Molecule Generation	Apr 30, 2025	3D Molecule GenerationBenchmarking	CodeCode Available	1
Collective Knowledge: organizing research projects as a database of reusable components and portable workflows with common APIs	Nov 2, 2020	Benchmarking	CodeCode Available	1
Combinatorial Optimization with Policy Adaptation using Latent Space Search	Nov 13, 2023	BenchmarkingCombinatorial Optimization	CodeCode Available	1

Show:10 25 50

← PrevPage 125 of 555Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified