SOTAVerified

Benchmarking

Papers

Showing 2391–2400 of 5548 papers

Title | Status | Hype
Leveraging Spatial and Semantic Feature Extraction for Skin Cancer Diagnosis with Capsule Networks and Graph Neural Networks | | 0
Benchmarking the Robustness of UAV Tracking Against Common Corruptions | Code | 0
A Sober Look at the Robustness of CLIPs to Spurious Features | | 0
FlowMind: Automatic Workflow Generation with LLMs | | 0
Granular Change Accuracy: A More Accurate Performance Metric for Dialogue State Tracking | | 0
Depression Detection on Social Media with Large Language Models | | 0
An Improved Metric and Benchmark for Assessing the Performance of Virtual Screening Models | Code | 1
Benchmarking Adversarial Robustness of Image Shadow Removal with Shadow-adaptive Attacks | | 0
Histo-Genomic Knowledge Distillation For Cancer Prognosis From Histopathology Whole Slide Images | Code | 1
Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified