Active Linear Regression for ℓ_p Norms and Beyond
Cameron Musco, Christopher Musco, David P. Woodruff, Taisuke Yasuda
Abstract
We study active sampling algorithms for linear regression, which aim to query only a few entries of a target vector b ∈ ℝ^n and output a near minimizer to min_{x ∈ ℝ^d} ‖Ax − b‖, for a design matrix A ∈ ℝ^{n×d} and loss ‖·‖. For ℓ_p norm regression for any 0 < p < ∞, we give an algorithm based on Lewis weight sampling outputting a (1+ε)-approximate solution using just Õ(d/ε^2) queries to b for p ∈ (0,1), Õ(d/ε) queries for 1 < p < 2, and Õ(d^{p/2}/ε^p) queries for 2 < p < ∞. For 0 < p < 2, our bounds are optimal up to log factors, settling the query complexity for this range. For 2 < p < ∞, our dependence on d is optimal, while our dependence on ε is off by at most a factor of ε, up to log factors.

Our result resolves an open question of [CD21], who gave near-optimal bounds for the ℓ_1 norm, but required d^2/ε^2 samples for ℓ_p regression with 1 < p < 2, and gave no bounds for 2 < p < ∞ or 0 < p < 1. We also give the first total sensitivity bound of O(d^{max{1,p/2}} log^2 n) for loss functions of degree-p polynomial growth, improving a result of [TMF20]. By combining this with our techniques for ℓ_p regression, we obtain an active regression algorithm making Õ(d^{1+max{1,p/2}}/poly(ε)) queries for such loss functions, including the Tukey and Huber losses, answering another question of [CD21]. For the Huber loss, we further improve our bound to Õ(d^{4−2√2}/poly(ε)) samples.

Our sensitivity bounds also have many applications, including Orlicz norm subspace embeddings, robust subspace approximation, and dimension reduction for smoothed p-norms. Finally, our active sampling results give the first sublinear time algorithms for Kronecker product regression under every ℓ_p norm.
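To make the sampling scheme concrete, here is a minimal sketch of Lewis weight sampling for active ℓ_p regression. It is an illustration under stated assumptions, not the paper's algorithm: the names lewis_weights, active_lp_regression, and b_oracle are hypothetical; the weights are computed with the well-known fixed-point iteration of Cohen and Peng, which converges for 0 < p < 4 (the paper also handles larger p by other means); and the subsampled problem is solved by plain IRLS as a generic stand-in, rather than the estimators that achieve the query bounds above.

```python
import numpy as np

def lewis_weights(A, p, iters=30):
    """Approximate the l_p Lewis weights of A via the fixed-point
    iteration w_i <- (a_i^T (A^T W^{1-2/p} A)^{-1} a_i)^{p/2}
    (Cohen--Peng), which converges for 0 < p < 4."""
    n, d = A.shape
    w = np.ones(n)
    for _ in range(iters):
        M = A.T @ (w[:, None] ** (1.0 - 2.0 / p) * A)  # A^T W^{1-2/p} A
        Q = A @ np.linalg.inv(M)
        q = np.einsum("ij,ij->i", Q, A)                # a_i^T M^{-1} a_i
        w = np.clip(q, 1e-12, None) ** (p / 2.0)
    return w

def active_lp_regression(A, b_oracle, p, m, seed=0):
    """Query only m entries of b, chosen by Lewis weight sampling,
    and return an approximate minimizer of ||Ax - b||_p."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    prob = lewis_weights(A, p)
    prob /= prob.sum()
    idx = rng.choice(n, size=m, p=prob)
    scale = (m * prob[idx]) ** (-1.0 / p)          # importance weights
    As = scale[:, None] * A[idx]
    bs = scale * b_oracle(idx)                     # the only queries to b
    # Solve the reweighted subsampled l_p problem by IRLS.
    x = np.linalg.lstsq(As, bs, rcond=None)[0]
    for _ in range(50):
        r = np.abs(As @ x - bs) + 1e-9
        s = r ** ((p - 2.0) / 2.0)                 # sqrt of IRLS weights r^{p-2}
        x = np.linalg.lstsq(s[:, None] * As, s * bs, rcond=None)[0]
    return x
```

For instance, active_lp_regression(A, lambda i: b[i], p=1.5, m=500) touches only 500 entries of b; the abstract's bound says that for 1 < p < 2, on the order of Õ(d/ε) such queries already suffice for a (1+ε)-approximate solution.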