Outlier-robust sparse/low-rank least-squares regression and robust matrix completion
Philip Thompson
- Code available: github.com/philipthomp/Outlier-robust-regression (official)
Abstract
We study high-dimensional least-squares regression within a subgaussian statistical learning framework with heterogeneous noise. It includes s-sparse and r-low-rank least-squares regression when a fraction ε of the labels are adversarially contaminated. We also present a novel theory of trace-regression with matrix decomposition based on a new application of the product process. For these problems, we show novel near-optimal "subgaussian" estimation rates of the form r(n,d_e) + √(log(1/δ)/n) + ε log(1/ε), valid with probability at least 1-δ. Here, r(n,d_e) is the optimal uncontaminated rate as a function of the effective dimension d_e but independent of the failure probability δ. These rates are valid uniformly in δ, i.e., the estimators' tuning does not depend on δ. Lastly, we consider noisy robust matrix completion with non-uniform sampling. If only the low-rank matrix is of interest, we present a novel near-optimal rate that is independent of the corruption level a. Our estimators are tractable and based on a new "sorted" Huber-type loss. No information on (s, r, ε, a) is needed to tune these estimators. Our analysis makes use of novel δ-optimal concentration inequalities for the multiplier and product processes, which could be useful elsewhere. For instance, they imply novel sharp oracle inequalities for Lasso and Slope with optimal dependence on δ. Numerical simulations confirm our theoretical predictions. In particular, "sorted" Huber regression can outperform classical Huber regression.
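The "sorted" Huber-type loss mentioned above can be illustrated with a minimal toy sketch. This is not the paper's exact estimator: the function names, the particular (non-increasing, Slope-style) threshold sequence, and the ranking convention below are all illustrative assumptions. The idea sketched is that each residual, ranked by magnitude, receives its own Huber threshold, so gross outliers are penalized only linearly while small residuals keep the quadratic loss.

```python
import numpy as np

def huber(r, tau):
    """Classical Huber penalty: quadratic for |r| <= tau, linear beyond."""
    a = np.abs(r)
    return np.where(a <= tau, 0.5 * r**2, tau * a - 0.5 * tau**2)

def sorted_huber_loss(residuals, taus):
    """Toy 'sorted' Huber-type loss (illustrative, not the paper's estimator):
    the i-th largest |residual| is penalized with its own threshold taus[i],
    where taus is assumed non-increasing, in the spirit of Slope weights."""
    r = np.sort(np.abs(residuals))[::-1]  # residual magnitudes, largest first
    return float(np.sum(huber(r, np.asarray(taus))))

# Example: one gross outlier among small residuals.
res = np.array([10.0, 0.3, 0.1])
taus = [2.0, 1.0, 0.5]  # hypothetical non-increasing thresholds
print(sorted_huber_loss(res, taus))  # 18.05: the outlier contributes linearly
```

Under squared loss the outlier above would contribute 50.0 on its own; here it contributes 2·10 − 2 = 18, which is the kind of dampening that makes the loss robust to adversarial label contamination.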