Reproducing and Improving CheXNet: Deep Learning for Chest X-ray Disease Classification

2025-05-10 · Code Available

Daniel Strick, Carlos Garcia, Anthony Huang

Abstract

Deep learning for radiologic image analysis is a rapidly growing field in biomedical research and is likely to become standard practice in modern medicine. Using the publicly available NIH ChestX-ray14 dataset, whose X-ray images are labeled for the presence or absence of 14 different diseases, we reproduced the CheXNet algorithm and explored other algorithms that outperform CheXNet's baseline metrics. Model performance was evaluated primarily with the F1 score and AUC-ROC, both of which are critical metrics for imbalanced, multi-label classification tasks in medical imaging. The best model achieved an average AUC-ROC of 0.85 and an average F1 score of 0.39 across all 14 disease classes in the dataset.
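
To make the two reported averages concrete, here is a minimal sketch of how a per-class AUC-ROC and F1 score can be macro-averaged over the labels of a multi-label classifier. It is not the authors' evaluation code: the function names, the toy data, and the fixed 0.5 decision threshold are assumptions for illustration, and the paper may select thresholds differently.

```python
from typing import Callable, List, Sequence


def auc_roc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_score(y_true: Sequence[int], y_score: Sequence[float],
             threshold: float = 0.5) -> float:
    """F1 = 2TP / (2TP + FP + FN) after thresholding the scores.

    The 0.5 threshold is an assumption for this sketch.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_average(labels: List[List[int]], scores: List[List[float]],
                  metric: Callable[[Sequence[int], Sequence[float]], float]) -> float:
    """Compute the metric separately for each class column, then average."""
    n_classes = len(labels[0])
    per_class = [
        metric([row[c] for row in labels], [row[c] for row in scores])
        for c in range(n_classes)
    ]
    return sum(per_class) / n_classes


# Toy data: 4 images, 2 hypothetical disease labels (the real dataset has 14).
labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
scores = [[0.9, 0.2], [0.4, 0.7], [0.6, 0.3], [0.1, 0.6]]

mean_auc = macro_average(labels, scores, auc_roc)
mean_f1 = macro_average(labels, scores, f1_score)
print(f"mean AUC-ROC = {mean_auc:.3f}, mean F1 = {mean_f1:.3f}")
```

Because AUC-ROC depends only on how scores are ranked while F1 depends on a chosen threshold, the two averages can differ widely on imbalanced labels, which is consistent with a model reporting a high average AUC-ROC alongside a much lower average F1.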

Benchmark Results

Dataset | Model | Metric | Claimed | Verified | Status
ChestX-ray14 | Improved CheXNet (DannyNet, dstrick17 et al., 2025) | Average AUC on 14 label | 85.27 | - | Unverified