Subspace approximation with outliers

2020-06-30Unverified0· sign in to hype

Amit Deshpande, Rameshwar Pratap

Unverified — Be the first to reproduce this paper.

Abstract

The subspace approximation problem with outliers, for given n points in d dimensions x_1,, x_n R^d, an integer 1 k d, and an outlier parameter 0 1, is to find a k-dimensional linear subspace of R^d that minimizes the sum of squared distances to its nearest (1-)n points. More generally, the _p subspace approximation problem with outliers minimizes the sum of p-th powers of distances instead of the sum of squared distances. Even the case of robust PCA is non-trivial, and previous work requires additional assumptions on the input. Any multiplicative approximation algorithm for the subspace approximation problem with outliers must solve the robust subspace recovery problem, a special case in which the (1-)n inliers in the optimal solution are promised to lie exactly on a k-dimensional linear subspace. However, robust subspace recovery is Small Set Expansion (SSE)-hard. We show how to extend dimension reduction techniques and bi-criteria approximations based on sampling to the problem of subspace approximation with outliers. To get around the SSE-hardness of robust subspace recovery, we assume that the squared distance error of the optimal k-dimensional subspace summed over the optimal (1-)n inliers is at least times its squared-error summed over all n points, for some 0 < 1 - . With this assumption, we give an efficient algorithm to find a subset of poly(k/) (1/) (1/) points whose span contains a k-dimensional subspace that gives a multiplicative (1+)-approximation to the optimal solution. The running time of our algorithm is linear in n and d. Interestingly, our results hold even when the fraction of outliers is large, as long as the obvious condition 0 < 1 - is satisfied.

Tasks

Dimensionality Reduction

Subspace approximation with outliers

Abstract

Tasks

Reproductions