
Batch List-Decodable Linear Regression via Higher Moments

2025-03-12

Ilias Diakonikolas, Daniel M. Kane, Sushrut Karmalkar, Sihan Liu, Thanasis Pittas


Abstract

We study the task of list-decodable linear regression using batches. A batch is called clean if it consists of i.i.d. samples from an unknown linear regression distribution. For a parameter α ∈ (0, 1/2), an unknown α-fraction of the batches are clean and no assumptions are made on the remaining ones. The goal is to output a small list of vectors at least one of which is close to the true regressor vector in ℓ₂-norm. [DJKS23] gave an efficient algorithm, under natural distributional assumptions, with the following guarantee. Assuming that the batch size n satisfies n ≥ Ω̃(α^(-1)) and the number of batches is m = poly(d, n, 1/α), their algorithm runs in polynomial time and outputs a list of O(1/α^2) vectors at least one of which is Õ(α^(-1/2)/√n) close to the target regressor. Here we design a new polynomial time algorithm with significantly stronger guarantees under the assumption that the low-degree moments of the covariates distribution are Sum-of-Squares (SoS) certifiably bounded. Specifically, for any constant δ > 0, as long as the batch size is n ≥ Ω_δ(α^(-δ)) and the degree-Θ(1/δ) moments of the covariates are SoS certifiably bounded, our algorithm uses m = poly((dn)^(1/δ), 1/α) batches, runs in polynomial time, and outputs an O(1/α)-sized list of vectors one of which is Õ(α^(-δ/2)/√n) close to the target. That is, our algorithm achieves a substantially smaller minimum batch size and final error, while achieving the optimal list size. Our approach uses higher-order moment information by carefully combining the SoS paradigm interleaved with an iterative method and a novel list pruning procedure. In the process, we give an SoS proof of the Marcinkiewicz-Zygmund inequality that may be of broader applicability.
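For reference, the Marcinkiewicz-Zygmund inequality mentioned at the end of the abstract can be stated in its standard scalar form as follows; note this is the classical version, not necessarily the exact SoS-certifiable variant proved in the paper:

```latex
% Marcinkiewicz-Zygmund inequality (classical scalar form).
% For independent mean-zero random variables X_1, ..., X_n and any
% p >= 2, there is a constant C_p depending only on p such that
\[
  \mathbb{E}\left|\sum_{i=1}^{n} X_i\right|^{p}
  \;\le\; C_p \,
  \mathbb{E}\left(\sum_{i=1}^{n} X_i^{2}\right)^{p/2}.
\]
% In the i.i.d. case with \mathbb{E}|X_1|^p < \infty, applying the
% power-mean inequality to the right-hand side gives
% \mathbb{E}|\sum_i X_i|^p \le C_p \, n^{p/2} \, \mathbb{E}|X_1|^p,
% i.e. the sum concentrates at the \sqrt{n} scale.
```

Bounds of this shape are what drive the 1/√n factor in the error guarantees above: averaging n samples within a clean batch shrinks the noise at the √n rate, provided the relevant moments are bounded (here, certifiably so in the SoS proof system).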
