SOTAVerified

Benchmarking

Papers

Showing 4331–4340 of 5548 papers

Title | Status | Hype
VITAL: A New Dataset for Benchmarking Pluralistic Alignment in Healthcare | | 0
VoiceWukong: Benchmarking Deepfake Voice Detection | | 0
V-STaR: Benchmarking Video-LLMs on Video Spatio-Temporal Reasoning | | 0
v-SVR Polynomial Kernel for Predicting the Defect Density in New Software Projects | | 0
Vulnerability of Face Morphing Attacks: A Case Study on Lookalike and Identical Twins | | 0
From Attack to Protection: Leveraging Watermarking Attack Network for Advanced Add-on Watermarking | | 0
Ward: Provable RAG Dataset Inference via LLM Watermarks | | 0
Watchog: A Light-weight Contrastive Learning based Framework for Column Annotation | | 0
WebVision Challenge: Visual Learning and Understanding With Web Data | | 0
WelQrate: Defining the Gold Standard in Small Molecule Drug Discovery Benchmarking | | 0
Page 434 of 555

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified