SOTAVerified

Benchmarking

Papers

Showing 2371–2380 of 5548 papers

Title | Status | Hype
Benchmarking Linguistic Diversity of Large Language Models | Code | 0
DVFL-Net: A Lightweight Distilled Video Focal Modulation Network for Spatio-Temporal Action Recognition | Code | 0
Ducho meets Elliot: Large-scale Benchmarks for Multimodal Recommendation | Code | 0
Geological Inference from Textual Data using Word Embeddings | Code | 0
Global Prediction of COVID-19 Variant Emergence Using Dynamics-Informed Graph Neural Networks | Code | 0
Flexible Generation of Preference Data for Recommendation Analysis | Code | 0
Are Synthetic Corruptions A Reliable Proxy For Real-World Corruptions? | Code | 0
DrugOOD: Out-of-Distribution (OOD) Dataset Curator and Benchmark for AI-aided Drug Discovery -- A Focus on Affinity Prediction Problems with Noise Annotations | Code | 0
Benchmarking Learning Efficiency in Deep Reservoir Computing | Code | 0
DQI: Measuring Data Quality in NLP | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified