SOTAVerified

Benchmarking

Papers

Showing 1621–1630 of 5548 papers

Title | Status | Hype
Keep Security! Benchmarking Security Policy Preservation in Large Language Model Contexts Against Indirect Attacks in Question Answering | Code | 0
An Experimental Evaluation of Imputation Models for Spatial-Temporal Traffic Data | Code | 0
KArSL: Arabic Sign Language Database | Code | 0
Benchmarking Children's ASR with Supervised and Self-supervised Speech Foundation Models | Code | 0
Joint Multi-Scale Tone Mapping and Denoising for HDR Image Enhancement | Code | 0
Benchmarking ChatGPT on Algorithmic Reasoning | Code | 0
Benchmarking ChatGPT-4 on ACR Radiation Oncology In-Training (TXIT) Exam and Red Journal Gray Zone Cases: Potentials and Challenges for AI-Assisted Medical Education and Decision Making in Radiation Oncology | Code | 0
JExplore: Design Space Exploration Tool for Nvidia Jetson Boards | Code | 0
Benchmarking Deep Learning Architectures for Predicting Readmission to the ICU and Describing Patients-at-Risk | Code | 0
KamNet: An Integrated Spatiotemporal Deep Neural Network for Rare Event Search in KamLAND-Zen | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified