Integrated path stability selection

2024-03-23

Omar Melikechi, Jeffrey W. Miller

Code

github.com/omelikechi/ipss (official, in paper)
github.com/omelikechi/ipssr

Abstract

Stability selection is a popular method for improving feature selection algorithms. One of its key attributes is that it provides theoretical upper bounds on the expected number of false positives, E(FP), enabling control of false positives in practice. However, stability selection often selects very few features, resulting in low sensitivity. This is because existing bounds on E(FP) are relatively loose, causing stability selection to overestimate the number of false positives. In this paper, we introduce a novel approach to stability selection based on integrating stability paths rather than maximizing over them. This yields upper bounds on E(FP) that are orders of magnitude stronger than previous bounds, leading to significantly more true positives in practice for the same target E(FP). Furthermore, our method takes the same amount of computation as the original stability selection algorithm, and only requires one user-specified parameter, which can be either the target E(FP) or target false discovery rate. We demonstrate the method on simulations and real data from prostate and colon cancer studies.
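To illustrate the distinction the abstract draws between maximizing over stability paths and integrating them, here is a minimal sketch in plain Python. It is not the authors' implementation (see the repositories listed under Code). The base selector (thresholding absolute correlation), the half-sample scheme, the uniform average over the grid, and all function names are assumptions chosen for illustration. The paper's actual integrand and its E(FP) bounds are not reproduced here.

```python
import random


def pearson(a, b):
    """Sample Pearson correlation; returns 0.0 if either input is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5 if va > 0 and vb > 0 else 0.0


def stability_paths(X, y, lambdas, n_subsamples=50, seed=0):
    """Estimate selection probabilities pi_j(lambda) from random half-samples.

    Toy base selector (an assumption, not from the paper): feature j is
    selected at level lam when |corr(x_j, y)| >= lam on the subsample.
    Returns paths[j][k] = fraction of subsamples selecting j at lambdas[k].
    """
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [[0] * len(lambdas) for _ in range(p)]
    for _ in range(n_subsamples):
        rows = rng.sample(range(n), n // 2)
        y_sub = [y[i] for i in rows]
        for j in range(p):
            r = abs(pearson([X[i][j] for i in rows], y_sub))
            for k, lam in enumerate(lambdas):
                if r >= lam:
                    counts[j][k] += 1
    return [[c / n_subsamples for c in row] for row in counts]


def max_score(path):
    """Score used when maximizing over the path: its largest value."""
    return max(path)


def integrated_score(path, lambdas):
    """Trapezoid-rule average of the path over the lambda grid.

    A plain average is a simplification; the paper's integrand may differ.
    """
    area = sum(
        (lambdas[k + 1] - lambdas[k]) * (path[k] + path[k + 1]) / 2
        for k in range(len(path) - 1)
    )
    return area / (lambdas[-1] - lambdas[0])


# Synthetic data: only features 0 and 1 influence y; the rest are noise.
data_rng = random.Random(1)
n, p = 200, 10
X = [[data_rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [row[0] + row[1] + 0.5 * data_rng.gauss(0, 1) for row in X]

lambdas = [0.05 + 0.9 * k / 19 for k in range(20)]  # increasing grid
paths = stability_paths(X, y, lambdas)
max_scores = [max_score(path) for path in paths]
integrated = [integrated_score(path, lambdas) for path in paths]

for j in range(p):
    print(f"feature {j}: max={max_scores[j]:.2f} integrated={integrated[j]:.2f}")
```

In this toy setup, a path that is high at only one grid point still gets a high max score, while its integrated score reflects how long it stays high across the grid. Turning either score into a selection rule with a target E(FP) requires the bounds derived in the paper.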

Tasks

feature selection
