Benchmarking

Papers

Showing 1981–1990 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
Illuminating the Diversity-Fitness Trade-Off in Black-Box Optimization	Aug 29, 2024	Benchmarking, Diversity	Code Available	0	5
A Meta-Analysis of the Anomaly Detection Problem	Mar 3, 2015	Anomaly Detection, Benchmarking	Code Available	0	5
Benchmarks for Graph Embedding Evaluation	Aug 19, 2019	Benchmarking, Graph Embedding	Code Available	0	5
IdeaBench: Benchmarking Large Language Models for Research Idea Generation	Oct 31, 2024	Benchmarking, scientific discovery	Code Available	0	5
Identifying and Benchmarking Natural Out-of-Context Prediction Problems	Oct 25, 2021	Benchmarking	Code Available	0	5
BaDLAD: A Large Multi-Domain Bengali Document Layout Analysis Dataset	Mar 9, 2023	Benchmarking, Deep Learning	Code Available	0	5
IceBench: A Benchmark for Deep Learning based Sea Ice Type Classification	Mar 22, 2025	Benchmarking, Classification	Code Available	0	5
Benchmarking Flexible Electric Loads Scheduling Algorithms under Market Price Uncertainty	Feb 4, 2020	Benchmarking, Decision Making	Code Available	0	5
ApisTox: a new benchmark dataset for the classification of small molecules toxicity on honey bees	Apr 24, 2024	Benchmarking, Molecular Property Prediction	Code Available	0	5
Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari	Feb 24, 2018	Atari Games, Benchmarking	Code Available	0	5


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified