SOTAVerified

Benchmarking

Papers

Showing 1021–1030 of 5548 papers

Title | Status | Hype
FedAIoT: A Federated Learning Benchmark for Artificial Intelligence of Things | Code | 1
Failure Detection in Medical Image Classification: A Reality Check and Benchmarking Testbed | Code | 1
Fantastic Questions and Where to Find Them: FairytaleQA -- An Authentic Dataset for Narrative Comprehension | Code | 1
Benchmarking Offline Reinforcement Learning on Real-Robot Hardware | Code | 1
AutoAdvExBench: Benchmarking autonomous exploitation of adversarial example defenses | Code | 1
Benchmarking Omni-Vision Representation through the Lens of Visual Realms | Code | 1
Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms | Code | 1
FedCV: A Federated Learning Framework for Diverse Computer Vision Tasks | Code | 1
FiFAR: A Fraud Detection Dataset for Learning to Defer | Code | 1
A skeletonization algorithm for gradient-based optimization | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified