SOTAVerified

Benchmarking

Papers

Showing 3226-3250 of 5548 papers

Title | Status | Hype
Generalization Bias in Large Language Model Summarization of Scientific Research | | 0
Generalization, Mayhems and Limits in Recurrent Proximal Policy Optimization | | 0
Generalized Attention Flow: Feature Attribution for Transformer Models via Maximum Flow | | 0
Generalized Conflict-directed Search for Optimal Ordering Problems | | 0
Generalizing Vision-Language Models to Novel Domains: A Comprehensive Survey | | 0
General Scales Unlock AI Evaluation with Explanatory and Predictive Power | | 0
Generating Artificial Outliers in the Absence of Genuine Ones -- a Survey | | 0
Generating Automotive Code: Large Language Models for Software Development and Verification in Safety-Critical Systems | | 0
Generating Diverse Synthetic Datasets for Evaluation of Real-life Recommender Systems | | 0
Hierarchical Data Generator based on Tree-Structured Stick Breaking Process for Benchmarking Clustering Methods | | 0
Generating Synthetic Electronic Health Record (EHR) Data: A Review with Benchmarking | | 0
Generation of Large District Heating System Models Using Open-Source Data and Tools: An Exemplary Workflow | | 0
Synthetic Observational Health Data with GANs: from slow adoption to a boom in medical research and ultimately digital twins? | | 0
Generative Adversarial Networks with Limited Data: A Survey and Benchmarking | | 0
Generative AI for Programming Education: Benchmarking ChatGPT, GPT-4, and Human Tutors | | 0
Generative AI for Synthetic Data Across Multiple Medical Modalities: A Systematic Review of Recent Developments and Challenges | | 0
Learning Dynamic Feature Selection for Fast Sequential Prediction | | 0
Learning Environment Models with Continuous Stochastic Dynamics | | 0
Learning Graphs for Knowledge Transfer With Limited Labels | | 0
Learning Hidden Physics and System Parameters with Deep Operator Networks | | 0
Learning Multimorbidity Patterns from Electronic Health Records Using Non-negative Matrix Factorisation | | 0
Benchmarking Augmentation Methods for Learning Robust Navigation Agents: the Winning Entry of the 2021 iGibson Challenge | | 0
Learning to Adapt to Online Streams with Distribution Shifts | | 0
Realistic Large-Scale Fine-Depth Dehazing Dataset from 3D Videos | | 0
Learning to Disambiguate by Asking Discriminative Questions | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified