Benchmarking

Papers

Showing 1241–1250 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
A Comprehensive Benchmark for COVID-19 Predictive Modeling Using Electronic Health Records in Intensive Care	Sep 16, 2022	Benchmarking, Deep Learning	Code Available	1	5
HAWKS: Evolving Challenging Benchmark Sets for Cluster Analysis	Feb 13, 2021	Benchmarking, Clustering	Code Available	1	5
Benchmarking Language Model Creativity: A Case Study on Code Generation	Jul 12, 2024	Benchmarking, Code Generation	Code Available	1	5
CLoG: Benchmarking Continual Learning of Image Generation Models	Jun 7, 2024	Benchmarking, Continual Learning	Code Available	1	5
Clinical Prompt Learning with Frozen Language Models	May 11, 2022	Benchmarking, GPU	Code Available	1	5
Benchmarking structure-based three-dimensional molecular generative models using GenBench3D: ligand conformation quality matters	Jul 5, 2024	Benchmarking, valid	Code Available	1	5
HazeSpace2M: A Dataset for Haze Aware Single Image Dehazing	Sep 25, 2024	Benchmarking, Image Dehazing	Code Available	1	5
HaloQuest: A Visual Hallucination Dataset for Advancing Multimodal Reasoning	Jul 22, 2024	Benchmarking, Hallucination	Code Available	1	5
Benchmarking Spectral Graph Neural Networks: A Comprehensive Study on Effectiveness and Efficiency	Jun 14, 2024	Benchmarking	Code Available	1	5
HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns	Jan 28, 2025	Adversarial Attack, Benchmarking	Code Available	1	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified