SOTAVerified

Benchmarking

Papers

Showing 13611370 of 5548 papers

TitleStatusHype
Controlgym: Large-Scale Control Environments for Benchmarking Reinforcement Learning AlgorithmsCode1
Benchmarking the Combinatorial Generalizability of Complex Query Answering on Knowledge GraphsCode1
Benchmarking the CoW with the TopCoW Challenge: Topology-Aware Anatomical Segmentation of the Circle of Willis for CTA and MRACode1
CovDocker: Benchmarking Covalent Drug Design with Tasks, Datasets, and SolutionsCode1
Benchmarking Image Retrieval for Visual LocalizationCode1
ArabicaQA: A Comprehensive Dataset for Arabic Question AnsweringCode1
Benchmarking the Robustness of Deep Neural Networks to Common Corruptions in Digital PathologyCode1
Benchmarking human visual search computational models in natural scenes: models comparison and reference datasetsCode1
Machine Translation Meta Evaluation through Translation Accuracy Challenge SetsCode1
Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative TasksCode1
Show:102550
← PrevPage 137 of 555Next →

Benchmark Results

#ModelMetricClaimedVerifiedStatus
1GPT-4 TurboACC0.56Unverified