Memory-Sample Tradeoffs for Linear Regression with Small Error
Vatsal Sharan, Aaron Sidford, Gregory Valiant
Abstract
We consider the problem of performing linear regression over a stream of d-dimensional examples, and show that any algorithm that uses a subquadratic amount of memory exhibits a slower rate of convergence than can be achieved without memory constraints. Specifically, consider a sequence of labeled examples $(a_1,b_1), (a_2,b_2), \ldots$, with $a_i$ drawn independently from a d-dimensional isotropic Gaussian, and where $b_i = \langle a_i, x\rangle + \eta_i$, for a fixed $x \in \mathbb{R}^d$ with $\|x\|_2 = 1$ and with independent noise $\eta_i$ drawn uniformly from the interval $[-2^{-d/5}, 2^{-d/5}]$. We show that any algorithm with at most $d^2/4$ bits of memory requires at least $\Omega(d \log\log(1/\epsilon))$ samples to approximate $x$ to $\ell_2$ error $\epsilon$ with probability of success at least 2/3, for $\epsilon$ sufficiently small as a function of d. In contrast, for such $\epsilon$, $x$ can be recovered to error $\epsilon$ with probability $1-o(1)$ with memory $O(d^2 \log(1/\epsilon))$ using $O(d)$ examples. This represents the first nontrivial lower bound for regression with super-linear memory, and may open the door to strong memory/sample tradeoffs for continuous optimization.
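As an illustrative sketch (not code from the paper), the memory-rich baseline the abstract contrasts against can be simulated directly: with O(d^2) memory one can accumulate the normal equations over the stream and solve least squares, recovering x well from O(d) examples. The dimension, seed, and sample count below are toy choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 20
x = rng.standard_normal(d)
x /= np.linalg.norm(x)             # ||x||_2 = 1, as in the abstract

noise_half_width = 2.0 ** (-d / 5)
n = 5 * d                          # O(d) streamed examples

AtA = np.zeros((d, d))             # O(d^2) memory: running A^T A
Atb = np.zeros(d)                  # and A^T b
for _ in range(n):
    a = rng.standard_normal(d)     # isotropic Gaussian example a_i
    eta = rng.uniform(-noise_half_width, noise_half_width)
    b = a @ x + eta                # label b_i = <a_i, x> + eta_i
    AtA += np.outer(a, a)
    Atb += b * a

x_hat = np.linalg.solve(AtA, Atb)  # least-squares estimate of x
print(np.linalg.norm(x_hat - x))   # small l_2 recovery error
```

The lower bound in the paper says that a streaming algorithm restricted to d^2/4 bits cannot match this convergence rate; the sketch only shows what the unconstrained baseline achieves.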