Benchmarking

Papers

Showing 1481–1490 of 5548 papers

Title	Date	Tasks	Status	Hype
Benchmarking Natural Language Understanding Services for building Conversational Agents	Mar 13, 2019	Benchmarking, General Classification	Code Available	1
NAS-Bench-101: Towards Reproducible Neural Architecture Search	Feb 25, 2019	Benchmarking, Neural Architecture Search	Code Available	1
The StarCraft Multi-Agent Challenge	Feb 11, 2019	Benchmarking, MuJoCo	Code Available	1
The Liver Tumor Segmentation Benchmark (LiTS)	Jan 13, 2019	Benchmarking, Computed Tomography (CT)	Code Available	1
LEAF: A Benchmark for Federated Settings	Dec 3, 2018	Autonomous Vehicles, Benchmarking	Code Available	1
GuacaMol: Benchmarking Models for De Novo Molecular Design	Nov 22, 2018	Benchmarking, Drug Discovery	Code Available	1
IOHprofiler: A Benchmarking and Profiling Tool for Iterative Optimization Heuristics	Oct 11, 2018	Benchmarking	Code Available	1
On Evaluation of Embodied Navigation Agents	Jul 18, 2018	Benchmarking	Code Available	1
Benchmarking Neural Network Robustness to Common Corruptions and Surface Variations	Jul 4, 2018	Adversarial Defense, Benchmarking	Code Available	1
Texygen: A Benchmarking Platform for Text Generation Models	Feb 6, 2018	Benchmarking, Diversity	Code Available	1

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified