Spike-and-Slab Posterior Sampling in High Dimensions
Syamantak Kumar, Purnamrita Sarkar, Kevin Tian, Yusong Zhu
Unverified — Be the first to reproduce this paper.
ReproduceAbstract
Posterior sampling with the spike-and-slab prior [MB88], a popular multimodal distribution used to model uncertainty in variable selection, is considered the theoretical gold standard method for Bayesian sparse linear regression [CPS09, Roc18]. However, designing provable algorithms for performing this sampling task is notoriously challenging. Existing posterior samplers for Bayesian sparse variable selection tasks either require strong assumptions about the signal-to-noise ratio (SNR) [YWJ16], only work when the measurement count grows at least linearly in the dimension [MW24], or rely on heuristic approximations to the posterior. We give the first provable algorithms for spike-and-slab posterior sampling that apply for any SNR, and use a measurement count sublinear in the problem dimension. Concretely, assume we are given a measurement matrix X R^n d and noisy observations y = X^ + of a signal ^ drawn from a spike-and-slab prior with a Gaussian diffuse density and expected sparsity k, where N(0_n, ^2I_n). We give a polynomial-time high-accuracy sampler for the posterior ( X, y), for any SNR ^-1 > 0, as long as n k^3 polylog(d) and X is drawn from a matrix ensemble satisfying the restricted isometry property. We further give a sampler that runs in near-linear time nd in the same setting, as long as n k^5 polylog(d). To demonstrate the flexibility of our framework, we extend our result to spike-and-slab posterior sampling with Laplace diffuse densities, achieving similar guarantees when = O(1k) is bounded.