
Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials

2023-07-24

Ilias Diakonikolas, Daniel M. Kane


Abstract

We study the problem of PAC learning a linear combination of k ReLU activations under the standard Gaussian distribution on R^d with respect to the square loss. Our main result is an efficient algorithm for this learning task with sample and computational complexity (dk/ε)^O(k), where ε > 0 is the target accuracy. Prior work had given an algorithm for this problem with complexity (dk/ε)^h(k), where the function h(k) scales super-polynomially in k. Interestingly, the complexity of our algorithm is near-optimal within the class of Correlational Statistical Query algorithms. At a high level, our algorithm uses tensor decomposition to identify a subspace such that all the O(k)-order moments are small in the orthogonal directions. Its analysis makes essential use of the theory of Schur polynomials to show that the higher-moment error tensors are small given that the lower-order ones are.
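The subspace-identification idea can be illustrated with a small numerical sketch. The snippet below is not the paper's algorithm: it uses only the degree-2 correlational moment E[y(xx^T − I)], whose range lies in span{w_1, …, w_k} for a target f(x) = Σ_i a_i ReLU(w_i · x) under standard Gaussian inputs, and it recovers the relevant subspace only in the benign case of positive coefficients. The paper's algorithm instead works with moment tensors of order up to O(k) (with Schur-polynomial bounds in the analysis) precisely to handle the cancellations that low-order moments cannot. All parameter choices below are illustrative.

```python
import numpy as np

# Illustrative sketch (not the paper's algorithm): recover the span of the
# hidden weight vectors of f(x) = sum_i a_i * ReLU(w_i . x) from the
# degree-2 correlational moment  M = E[f(x) (x x^T - I)].  For standard
# Gaussian x and unit-norm w_i, M = c * sum_i a_i w_i w_i^T with c > 0,
# so with positive coefficients its top-k eigenspace is span{w_1,...,w_k}.

rng = np.random.default_rng(0)
d, k, n = 20, 3, 200_000

# Hypothetical ground-truth network: random unit weight vectors, positive coefficients.
W = rng.normal(size=(k, d))
W /= np.linalg.norm(W, axis=1, keepdims=True)
a = rng.uniform(0.5, 1.5, size=k)

X = rng.normal(size=(n, d))                 # standard Gaussian inputs
y = np.maximum(X @ W.T, 0.0) @ a            # labels f(x)

# Empirical second-order correlational moment  M ~ E[y (x x^T - I)].
M = (X.T * y) @ X / n - np.mean(y) * np.eye(d)

# Top-k eigenspace (by magnitude) of M approximates span{w_1, ..., w_k}.
eigvals, eigvecs = np.linalg.eigh(M)
U = eigvecs[:, np.argsort(np.abs(eigvals))[::-1][:k]]   # d x k orthonormal basis

# Sanity check: the true weights should lie (almost) inside the recovered subspace.
residual = np.linalg.norm(W.T - U @ (U.T @ W.T)) / np.linalg.norm(W)
print(f"relative residual of true weights outside recovered subspace: {residual:.3f}")
```

With signed coefficients the terms a_i w_i w_i^T can cancel, which is why higher-order moment tensors, and the Schur-polynomial machinery controlling their errors, are needed in the general case.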
