Mimic-IV-ICD: A new benchmark for eXtreme MultiLabel Classification
Thanh-Tung Nguyen, Viktor Schlegel, Abhinav Kashyap, Stefan Winkler, Shao-Syuan Huang, Jie-Jyun Liu, Chih-Jen Lin
Code
- github.com/thomasnguyen92/MIMIC-IV-ICD-data-processing (official, ★ 46)
Abstract
Clinical notes are assigned ICD codes, which are sets of codes for diagnoses and procedures. In recent years, predictive machine learning models have been built for automatic ICD coding. However, there is a lack of widely accepted benchmarks for automated ICD coding models based on large-scale public EHR data. This paper proposes a public benchmark suite for ICD-10 coding using a large EHR dataset derived from MIMIC-IV, the most recent public EHR dataset. We implement and compare several popular methods for ICD coding prediction tasks to standardize data preprocessing and establish a comprehensive ICD coding benchmark dataset. This approach fosters reproducibility and model comparison, accelerating progress toward employing automated ICD coding in future studies. Furthermore, we create a new ICD-9 benchmark using MIMIC-IV data, providing more data points and a higher number of ICD codes than MIMIC-III. Our open-source code offers easy access to data processing steps, benchmark creation, and experiment replication for those with MIMIC-IV access, providing insights, guidance, and protocols to efficiently develop ICD coding models.
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| MIMIC-IV-ICD10-full | CAML | Macro AUC | 89.91 | — | Unverified |
| MIMIC-IV-ICD10-full | PLM-ICD | Macro AUC | 91.85 | — | Unverified |
| MIMIC-IV-ICD10-full | LAAT | Macro AUC | 92.96 | — | Unverified |
| MIMIC-IV-ICD10-full | Joint LAAT | Macro AUC | 93.64 | — | Unverified |
| MIMIC-IV-ICD10-full | MSMN | Macro AUC | 97.07 | — | Unverified |
| MIMIC-IV-ICD10-top50 | MSMN | F1 (micro) | 74.15 | — | Unverified |
| MIMIC-IV-ICD10-top50 | PLM-ICD | F1 (micro) | 73.27 | — | Unverified |
| MIMIC-IV-ICD10-top50 | Joint LAAT | F1 (micro) | 72.85 | — | Unverified |
| MIMIC-IV-ICD10-top50 | LAAT | F1 (micro) | 72.56 | — | Unverified |
| MIMIC-IV-ICD10-top50 | CAML | F1 (micro) | 67.56 | — | Unverified |
| MIMIC-IV-ICD9-full | CAML | Macro AUC | 93.45 | — | Unverified |
| MIMIC-IV-ICD9-full | LAAT | Macro AUC | 95.18 | — | Unverified |
| MIMIC-IV-ICD9-full | Joint LAAT | Macro AUC | 95.57 | — | Unverified |
| MIMIC-IV-ICD9-full | PLM-ICD | Macro AUC | 96.61 | — | Unverified |
| MIMIC-IV-ICD9-full | MSMN | Macro AUC | 96.79 | — | Unverified |
| MIMIC-IV-ICD9-top50 | MSMN | Macro AUC | 95.13 | — | Unverified |
| MIMIC-IV-ICD9-top50 | PLM-ICD | Macro AUC | 94.97 | — | Unverified |
| MIMIC-IV-ICD9-top50 | Joint LAAT | Macro AUC | 94.92 | — | Unverified |
| MIMIC-IV-ICD9-top50 | LAAT | Macro AUC | 94.88 | — | Unverified |
| MIMIC-IV-ICD9-top50 | CAML | Macro AUC | 93.07 | — | Unverified |
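
For readers unfamiliar with the two metrics in the table, the following is a minimal sketch of how micro-averaged F1 and macro-averaged AUC are commonly computed for multi-label predictions. It is not the benchmark's own evaluation code. The function names, the 0.5 decision threshold, the choice to skip codes that have only one class present, and the toy data are all assumptions made for illustration. The table values appear to be percentages, whereas this sketch returns fractions in [0, 1].

```python
from typing import List, Optional, Sequence


def micro_f1(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Micro F1: pool true/false positives and false negatives over all (note, code) pairs."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> Optional[float]:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        return None  # AUC is undefined when only one class is present
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auc(y_true: Sequence[Sequence[int]], y_score: Sequence[Sequence[float]]) -> float:
    """Macro AUC: unweighted mean of per-code AUCs (assumption: undefined codes are skipped)."""
    aucs: List[float] = []
    for j in range(len(y_true[0])):
        auc = binary_auc([row[j] for row in y_true], [row[j] for row in y_score])
        if auc is not None:
            aucs.append(auc)
    return sum(aucs) / len(aucs) if aucs else float("nan")


# Toy data (hypothetical): 4 notes, 3 codes.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_score = [[0.9, 0.2, 0.6], [0.1, 0.8, 0.3], [0.7, 0.4, 0.55], [0.3, 0.1, 0.2]]
# Assumed decision threshold of 0.5 for turning scores into binary predictions.
y_pred = [[1 if s >= 0.5 else 0 for s in row] for row in y_score]

print(f"micro F1:  {micro_f1(y_true, y_pred):.4f}")
print(f"macro AUC: {macro_auc(y_true, y_score):.4f}")
```

Micro F1 depends on the thresholded predictions and is dominated by frequent codes, while macro AUC is threshold-free and weights every code equally, which is one reason the two metrics can rank models differently.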