SOTAVerified

Benchmarking

Papers

Showing 1981–1990 of 5548 papers

Title | Status | Hype
Illuminating the Diversity-Fitness Trade-Off in Black-Box Optimization | Code | 0
A Meta-Analysis of the Anomaly Detection Problem | Code | 0
Benchmarks for Graph Embedding Evaluation | Code | 0
IdeaBench: Benchmarking Large Language Models for Research Idea Generation | Code | 0
Identifying and Benchmarking Natural Out-of-Context Prediction Problems | Code | 0
BaDLAD: A Large Multi-Domain Bengali Document Layout Analysis Dataset | Code | 0
IceBench: A Benchmark for Deep Learning based Sea Ice Type Classification | Code | 0
Benchmarking Flexible Electric Loads Scheduling Algorithms under Market Price Uncertainty | Code | 0
ApisTox: a new benchmark dataset for the classification of small molecules toxicity on honey bees | Code | 0
Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified