Sensitivity of Survival Analysis Metrics
Iulii Vasilev, Mikhail Petrovskiy, Igor Mashechkin
Abstract
Survival analysis models predict the probability of an event over time. Survival data are characterized by the distribution of events over time and by the proportion of classes. Late events are often rare, do not follow the main distribution, and strongly affect both the quality of the models and its assessment. In this paper, we identify four cases of excessive sensitivity of survival analysis metrics and propose methods to overcome them. To equalize the impact of observations, we adjust the weights of events based on the target time and the censoring indicator. Judging by the sensitivity of the metrics, AUPRC (area under the Precision-Recall curve) is best suited for assessing the quality of survival models, while the other metrics are used as loss functions. To evaluate the influence of the loss function, the Bagging model uses these metrics to select the size and hyperparameters of the ensemble. The experimental study included eight real medical datasets. The proposed modifications of IBS (Integrated Brier Score) improved the quality of Bagging compared with the classical loss functions. In addition, on seven of the eight datasets, Bagging with the new loss functions outperforms the existing models of the scikit-survival library.
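To make the role of observation weights concrete, the following is a minimal sketch of the classical Integrated Brier Score with inverse-probability-of-censoring weights. It is not the modified metric proposed in the paper. The function names (`censoring_survival`, `brier_score`, `integrated_brier_score`) and the toy data are hypothetical, and evaluating the censoring estimate at the event time itself is a simplifying assumption.

```python
import numpy as np

def censoring_survival(time, event):
    """Kaplan-Meier estimate G(t) of the censoring distribution (censoring is the 'event')."""
    time = np.asarray(time, dtype=float)
    cens = ~np.asarray(event, dtype=bool)
    uniq = np.unique(time)
    at_risk = np.array([(time >= u).sum() for u in uniq])
    n_cens = np.array([(cens & (time == u)).sum() for u in uniq])
    steps = np.cumprod(1.0 - n_cens / at_risk)

    def G(t):
        # Index of the last unique time <= t; before the first time, G = 1.
        idx = np.searchsorted(uniq, t, side="right") - 1
        return np.where(idx >= 0, steps[np.clip(idx, 0, None)], 1.0)

    return G

def brier_score(t, time, event, surv_at_t, G):
    """Weighted squared error between predicted S(t | x_i) and the observed status at t."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    surv_at_t = np.asarray(surv_at_t, dtype=float)
    died = (time <= t) & event   # event observed by t: true survival status is 0
    alive = time > t             # still under observation at t: true status is 1
    w = np.zeros(len(time))      # observations censored before t keep weight 0
    w[died] = 1.0 / G(time[died])
    if alive.any():
        w[alive] = 1.0 / G(t)
    err = np.where(died, surv_at_t ** 2, (1.0 - surv_at_t) ** 2)
    return float(np.mean(w * err))

def integrated_brier_score(grid, time, event, surv_pred):
    """Trapezoidal average of the Brier score over an increasing time grid.

    surv_pred has shape (n_samples, len(grid)) and holds predicted S(grid[j] | x_i).
    """
    grid = np.asarray(grid, dtype=float)
    surv_pred = np.asarray(surv_pred, dtype=float)
    G = censoring_survival(time, event)
    bs = np.array([brier_score(t, time, event, surv_pred[:, j], G)
                   for j, t in enumerate(grid)])
    area = np.sum((bs[1:] + bs[:-1]) / 2.0 * np.diff(grid))
    return float(area / (grid[-1] - grid[0]))

# Toy data (hypothetical): four uncensored observations and a three-point grid.
time = np.array([1.0, 2.0, 3.0, 4.0])
event = np.array([True, True, True, True])
grid = np.array([1.5, 2.5, 3.5])
perfect = (time[:, None] > grid[None, :]).astype(float)
ibs_perfect = integrated_brier_score(grid, time, event, perfect)
ibs_constant = integrated_brier_score(grid, time, event, np.full((4, 3), 0.5))
```

In this classical form, each observation's contribution depends on its target time and censoring indicator only through the weights `w`, which is where a reweighting scheme of the kind the abstract describes would enter.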