SOTAVerified

Benchmarking

Papers

Showing 1631-1640 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| Benchmarking Deep Learning Architectures for Predicting Readmission to the ICU and Describing Patients-at-Risk | Code | 0 |
| Keep Security! Benchmarking Security Policy Preservation in Large Language Model Contexts Against Indirect Attacks in Question Answering | Code | 0 |
| Knowing-how & Knowing-that: A New Task for Machine Comprehension of User Manuals | Code | 0 |
| Joint Multi-Scale Tone Mapping and Denoising for HDR Image Enhancement | Code | 0 |
| A New Cervical Cytology Dataset for Nucleus Detection and Image Classification (Cervix93) and Methods for Cervical Nucleus Detection | Code | 0 |
| A new baseline for retinal vessel segmentation: Numerical identification and correction of methodological inconsistencies affecting 100+ papers | Code | 0 |
| JExplore: Design Space Exploration Tool for Nvidia Jetson Boards | Code | 0 |
| A Biologically Plausible Benchmark for Contextual Bandit Algorithms in Precision Oncology Using in vitro Data | Code | 0 |
| JATE 2.0: Java Automatic Term Extraction with Apache Solr | Code | 0 |
| Is Your Model Fairly Certain? Uncertainty-Aware Fairness Evaluation for LLMs | Code | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |