SOTAVerified

Benchmarking

Papers

Showing 391-400 of 5548 papers

Title | Status | Hype
Benchmarking Graph Neural Networks | Code | 2
Benchmarking Zero-shot Text Classification: Datasets, Evaluation and Entailment Approach | Code | 2
Habitat: A Platform for Embodied AI Research | Code | 2
Benchmarking Neural Network Robustness to Common Corruptions and Perturbations | Code | 2
A large annotated medical image dataset for the development and evaluation of segmentation algorithms | Code | 2
Benchmarking Deep Reinforcement Learning for Continuous Control | Code | 2
LLMThinkBench: Towards Basic Math Reasoning and Overthinking in Large Language Models | Code | 1
Latent Thermodynamic Flows: Unified Representation Learning and Generative Modeling of Temperature-Dependent Behaviors from Limited Data | Code | 1
CovDocker: Benchmarking Covalent Drug Design with Tasks, Datasets, and Solutions | Code | 1
WattsOnAI: Measuring, Analyzing, and Visualizing Energy and Carbon Footprint of AI Workloads | Code | 1
Page 40 of 555

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified