
Sparse Features for PCA-Like Linear Regression

2011-12-01 · NeurIPS 2011

Christos Boutsidis, Petros Drineas, Malik Magdon-Ismail


Abstract

Principal Components Analysis (PCA) is often used as a feature extraction procedure. Given a matrix $X \in \mathbb{R}^{n \times d}$, whose rows represent $n$ data points with respect to $d$ features, the top $k$ right singular vectors of $X$ (the so-called eigenfeatures) are arbitrary linear combinations of all available features. The eigenfeatures are very useful in data analysis, including the regularization of linear regression. Enforcing sparsity on the eigenfeatures, i.e., forcing them to be linear combinations of only a small number of actual features (as opposed to all available features), can promote better generalization error and improve the interpretability of the eigenfeatures. We present deterministic and randomized algorithms that construct such sparse eigenfeatures while provably achieving in-sample performance comparable to regularized linear regression. Our algorithms are relatively simple and practically efficient, and we demonstrate their performance on several data sets.
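To make the abstract's terminology concrete, the following is a minimal NumPy sketch of the dense baseline it describes: computing the top k right singular vectors (eigenfeatures) and using them to regularize least squares, often called PCA regression. It is not the paper's sparse algorithm, which the abstract does not specify. The function names `top_k_eigenfeatures` and `pca_regression` and the synthetic data are assumptions made purely for illustration.

```python
import numpy as np


def top_k_eigenfeatures(X, k):
    """Return a d x k matrix whose columns are the top-k right singular vectors of X."""
    # With full_matrices=False, the rows of Vt are right singular vectors,
    # ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:k].T


def pca_regression(X, y, k):
    """Least squares restricted to the span of the top-k eigenfeatures (illustrative helper)."""
    V_k = top_k_eigenfeatures(X, k)
    Z = X @ V_k  # n x k matrix of projected data
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return V_k @ coef  # weights expressed in the original d features


# Synthetic data, assumed only for demonstration.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 8))
y = X @ rng.standard_normal(8) + 0.1 * rng.standard_normal(50)

V_k = top_k_eigenfeatures(X, 3)
w = pca_regression(X, y, 3)

# Each eigenfeature is typically a combination of all d original features,
# which is the density that sparse eigenfeatures are meant to avoid.
nonzeros_per_eigenfeature = np.count_nonzero(V_k, axis=0)
```

Here each column of `V_k` generally has a nonzero weight on every original feature. A sparse variant would restrict each column to a small subset of features while trying to keep the regression fit close to this baseline.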
