SOTAVerified

Benchmarking

Papers

Showing 2491–2500 of 5548 papers

Title | Status | Hype
From Code to Play: Benchmarking Program Search for Games Using Large Language Models | | 0
Asynchronous Batch Bayesian Optimization with Pipelining Evaluations for Experimental Resource-constrained Conditions | Code | 0
Uniform Discretized Integrated Gradients: An effective attribution based method for explaining large language models | | 0
ARTeFACT: Benchmarking Segmentation Models on Diverse Analogue Media Damage | | 0
AdvDreamer Unveils: Are Vision-Language Models Truly Ready for Real-World 3D Variations? | | 0
Benchmarking Attention Mechanisms and Consistency Regularization Semi-Supervised Learning for Post-Flood Building Damage Assessment in Satellite Images | | 0
Benchmarking terminology building capabilities of ChatGPT on an English-Russian Fashion Corpus | | 0
Benchmarking Pretrained Attention-based Models for Real-Time Recognition in Robot-Assisted Esophagectomy | | 0
Benchmarking Harmonized Tariff Schedule Classification Models | | 0
OODFace: Benchmarking Robustness of Face Recognition under Common Corruptions and Appearance Variations | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified