Benchmarking

Papers

Showing 391–400 of 5548 papers

Title	Date	Tasks	Status	Hype
Benchmarking Graph Neural Networks	Mar 2, 2020	Benchmarking, Graph Classification	Code Available	2
Benchmarking Zero-shot Text Classification: Datasets, Evaluation and Entailment Approach	Aug 31, 2019	Articles, Benchmarking	Code Available	2
Habitat: A Platform for Embodied AI Research	Apr 2, 2019	Benchmarking, GPU	Code Available	2
Benchmarking Neural Network Robustness to Common Corruptions and Perturbations	Mar 28, 2019	Adversarial Defense, Benchmarking	Code Available	2
A large annotated medical image dataset for the development and evaluation of segmentation algorithms	Feb 25, 2019	Benchmarking, Segmentation	Code Available	2
Benchmarking Deep Reinforcement Learning for Continuous Control	Apr 22, 2016	Action Triplet Recognition, Atari Games	Code Available	2
LLMThinkBench: Towards Basic Math Reasoning and Overthinking in Large Language Models	Jul 5, 2025	Benchmarking, GPU	Code Available	1
Latent Thermodynamic Flows: Unified Representation Learning and Generative Modeling of Temperature-Dependent Behaviors from Limited Data	Jul 3, 2025	Benchmarking, Representation Learning	Code Available	1
CovDocker: Benchmarking Covalent Drug Design with Tasks, Datasets, and Solutions	Jun 26, 2025	Benchmarking, Drug Design	Code Available	1
WattsOnAI: Measuring, Analyzing, and Visualizing Energy and Carbon Footprint of AI Workloads	Jun 25, 2025	Benchmarking	Code Available	1

Page 40 of 555

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified