Decomposing Probabilistic Scores: Reliability, Information Loss and Uncertainty
Arthur Charpentier, Agathe Fernandes-Machado
Abstract
Calibration is a conditional property that depends on the information retained by a predictor. We develop decomposition identities for arbitrary proper losses that make this dependence explicit. At any information level A, the expected loss of an A-measurable predictor splits into a proper-regret (reliability) term and a conditional entropy (residual uncertainty) term. For nested levels A B, a chain decomposition quantifies the information gain from A to B. Applied to classification with features X and score S=s(X), this yields a three-term identity: miscalibration, a grouping term measuring information loss from X to S, and irreducible uncertainty at the feature level. We leverage the framework to analyze post-hoc recalibration, aggregation of calibrated models, and stagewise/boosting constructions, with explicit forms for Brier and log-loss.