Outlier Robust Multivariate Polynomial Regression
Vipul Arora, Arnab Bhattacharyya, Mathews Boban, Venkatesan Guruswami, Esty Kelman
Abstract
We study the problem of robust multivariate polynomial regression: let p : ℝ^n → ℝ be an unknown n-variate polynomial of degree at most d in each variable. We are given as input a set of random samples (x_i, y_i) ∈ [-1,1]^n × ℝ that are noisy versions of (x_i, p(x_i)). More precisely, each x_i is sampled independently from some distribution χ on [-1,1]^n, and for each i independently, y_i is arbitrary (i.e., an outlier) with probability at most ρ < 1/2, and otherwise satisfies |y_i − p(x_i)| ≤ σ. The goal is to output a polynomial p̂, of degree at most d in each variable, within an ℓ_∞-distance of at most O(σ) from p. Kane, Karmalkar, and Price [FOCS'17] solved this problem for n = 1. We generalize their results to the n-variate setting, showing an algorithm that achieves a sample complexity of O_n(d^n log d), where the hidden constant depends on n, if χ is the n-dimensional Chebyshev distribution. The sample complexity is O_n(d^{2n} log d) if the samples are drawn from the uniform distribution instead. The approximation error is guaranteed to be at most O(σ), and the run-time depends on log(1/σ). In the setting where each x_i and y_i are known up to N bits of precision, the run-time's dependence on N is linear. We also show that our sample complexities are optimal in terms of d^n. Furthermore, we show that it is possible to have the run-time be independent of 1/σ, at the cost of a higher sample complexity.
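To make the sampling model concrete, here is a minimal sketch for the n = 1 case: draw x_i from the Chebyshev distribution on [-1,1], corrupt a ρ-fraction of responses arbitrarily, bound the inlier noise by σ, and then fit a polynomial. The example polynomial, parameter values, and the least-absolute-deviations (IRLS) fitter are all our own illustrative choices — this is a standard robust baseline, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example polynomial p(x) = 1 - 2x + 0.5x^3 (degree d = 3),
# coefficients stored in increasing powers of x.
coef_true = np.array([1.0, -2.0, 0.0, 0.5])

def p(x):
    return np.polyval(coef_true[::-1], x)  # polyval wants decreasing powers

m, rho, sigma = 2000, 0.2, 0.01  # sample count, outlier rate, inlier noise bound

# Chebyshev distribution on [-1, 1]: x = cos(theta) with theta uniform on [0, pi].
x = np.cos(rng.uniform(0.0, np.pi, m))

# Inliers satisfy |y_i - p(x_i)| <= sigma; with probability rho, y_i is
# replaced by an arbitrary (here: biased) outlier value.
y = p(x) + rng.uniform(-sigma, sigma, m)
outliers = rng.random(m) < rho
y[outliers] = rng.uniform(5.0, 10.0, outliers.sum())

# Vandermonde design matrix with columns 1, x, x^2, x^3.
A = np.vander(x, 4, increasing=True)

# Plain least squares gets skewed by the biased outliers.
c_ls = np.linalg.lstsq(A, y, rcond=None)[0]

# Robust baseline: minimize sum_i |y_i - (A c)_i| via iteratively
# reweighted least squares (weights ~ 1/|residual|).
c = c_ls.copy()
for _ in range(100):
    w = 1.0 / np.maximum(np.abs(y - A @ c), 1e-6)
    Aw = A * w[:, None]
    c = np.linalg.solve(A.T @ Aw, Aw.T @ y)

# Compare sup-norm (ell_infinity) error of the two fits on a grid.
grid = np.linspace(-1.0, 1.0, 201)
err_l1 = np.max(np.abs(np.polyval(c[::-1], grid) - p(grid)))
err_ls = np.max(np.abs(np.polyval(c_ls[::-1], grid) - p(grid)))
print(f"sup-norm error: robust L1 fit {err_l1:.4f} vs least squares {err_ls:.4f}")
```

Since the contamination rate is below 1/2, the L1 fit stays within a small multiple of σ of p in sup-norm on this instance, while ordinary least squares is pulled toward the outliers — mirroring the O(σ) recovery guarantee the abstract states for the actual algorithm.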