
High-Dimensional Regression with Binary Coefficients. Estimating Squared Error and a Phase Transition

2017-01-16

David Gamarnik, Ilias Zadik


Abstract

We consider a sparse linear regression model $Y = X\beta^* + W$ where $X$ has i.i.d. $N(0,1)$ entries, $W$ is the noise vector with i.i.d. mean-zero $N(0,\sigma^2)$ entries, and $\beta^*$ is a binary vector with support size (sparsity) $k$. Using a novel conditional second moment method we obtain a tight, up to a multiplicative constant, approximation of the optimal squared error $\min_{\beta} \|Y - X\beta\|_2$, where the minimization is over all $k$-sparse binary vectors $\beta$. The approximation reveals interesting structural properties of the underlying regression problem. In particular, a) We establish that $n^* = 2k \log p / \log(2k/\sigma^2 + 1)$ is a phase transition point with the following "all-or-nothing" property. When $n$ exceeds $n^*$, $(2k)^{-1}\|\hat{\beta}_2 - \beta^*\|_0 \approx 0$, and when $n$ is below $n^*$, $(2k)^{-1}\|\hat{\beta}_2 - \beta^*\|_0 \approx 1$, where $\hat{\beta}_2$ is the optimal solution achieving the smallest squared error. With this we prove that $n^*$ is the asymptotic threshold for recovering $\beta^*$ information-theoretically. b) We compute the squared error for an intermediate problem $\min_{\beta} \|Y - X\beta\|_2$ where the minimization is restricted to vectors $\beta$ with $\|\beta - \beta^*\|_0 = 2k\zeta$, for $\zeta \in [0,1]$. We show that a lower bound part $\Gamma(\zeta)$ of the estimate, which corresponds to the estimate based on the first moment method, undergoes a phase transition at three different thresholds: first at $n_{\mathrm{inf},1} = \sigma^2 \log p$, which is the information-theoretic bound for recovering $\beta^*$ when $k = 1$ and $\sigma$ is large, then at $n^*$, and finally at $n_{\mathrm{LASSO/CS}}$. c) We establish a certain Overlap Gap Property (OGP) on the space of all binary vectors $\beta$ when $n \le ck \log p$ for a sufficiently small constant $c$. We conjecture that OGP is the source of the algorithmic hardness of solving the minimization problem $\min_{\beta} \|Y - X\beta\|_2$ in the regime $n < n_{\mathrm{LASSO/CS}}$.
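As a hedged illustration (not from the paper), the sketch below simulates the model $Y = X\beta^* + W$ at small, arbitrarily chosen sizes ($p$, $k$, $\sigma$ here are assumptions for demonstration only) and brute-forces the $k$-sparse binary least-squares estimator $\hat{\beta}_2$, printing the normalized support error $(2k)^{-1}\|\hat{\beta}_2 - \beta^*\|_0$ for a sample size below and above $n^* = 2k\log p / \log(2k/\sigma^2 + 1)$. At these finite sizes the simulation can only suggest, not verify, the asymptotic all-or-nothing behavior.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (assumed) parameters; the paper's results are asymptotic in p.
p, k, sigma = 30, 3, 0.5
n_star = 2 * k * np.log(p) / np.log(2 * k / sigma**2 + 1)

def normalized_support_error(n, trials=5):
    """Average of (2k)^{-1} * ||beta_hat_2 - beta*||_0 over random instances."""
    errors = []
    for _ in range(trials):
        X = rng.standard_normal((n, p))           # i.i.d. N(0,1) design
        beta_star = np.zeros(p)
        beta_star[rng.choice(p, size=k, replace=False)] = 1.0
        Y = X @ beta_star + sigma * rng.standard_normal(n)

        # Exhaustive search over all k-sparse binary vectors; feasible
        # only at toy sizes (C(30, 3) = 4060 candidates here).
        best_beta, best_err = None, np.inf
        for support in itertools.combinations(range(p), k):
            beta = np.zeros(p)
            beta[list(support)] = 1.0
            err = np.linalg.norm(Y - X @ beta)
            if err < best_err:
                best_beta, best_err = beta, err
        errors.append(np.count_nonzero(best_beta - beta_star) / (2 * k))
    return np.mean(errors)

for n in (max(1, int(0.5 * n_star)), int(2 * n_star)):
    print(f"n = {n:3d}  (n* = {n_star:.1f})  "
          f"normalized support error = {normalized_support_error(n):.2f}")
```

At realistic scales the exhaustive search is intractable, which is precisely the regime where the conjectured OGP-based algorithmic hardness would matter.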
