SOTAVerified

Benchmarking

Papers

Showing 2001–2010 of 5548 papers

Title | Status | Hype
Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari | Code | 0
Illusory VQA: Benchmarking and Enhancing Multimodal Models on Visual Illusions | Code | 0
Benchmark of Deep Learning Models on Large Healthcare MIMIC Datasets | Code | 0
AlphaZip: Neural Network-Enhanced Lossless Text Compression | Code | 0
Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study | Code | 0
Identifying Money Laundering Subgraphs on the Blockchain | Code | 0
A Wild Bootstrap for Degenerate Kernel Tests | Code | 0
Identifying the Smallest Adversarial Load Perturbations that Render DC-OPF Infeasible | Code | 0
Benchmarking YOLOv5 and YOLOv7 models with DeepSORT for droplet tracking applications | Code | 0
Identifying and Benchmarking Natural Out-of-Context Prediction Problems | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified