Algorithms and SQ Lower Bounds for Robustly Learning Real-valued Multi-index Models
Ilias Diakonikolas, Giannis Iakovidis, Daniel M. Kane, Lisheng Ren
Abstract
We study the complexity of learning real-valued Multi-Index Models (MIMs) under the Gaussian distribution. A K-MIM is a function f : R^d → R that depends only on the projection of its input onto a K-dimensional subspace. We give a general algorithm for PAC learning a broad class of MIMs with respect to the square loss, even in the presence of adversarial label noise. Moreover, we establish a nearly matching Statistical Query (SQ) lower bound, providing evidence that the complexity of our algorithm is qualitatively optimal as a function of the dimension. Specifically, we consider the class of bounded-variation MIMs with the property that distinguishing moments of degree at most m exist with respect to projections onto any subspace. In the presence of adversarial label noise, the complexity of our learning algorithm is d^O(m) · 2^poly(K/ε). For the realizable and independent noise settings, our algorithm incurs complexity d^O(m) · 2^poly(K) · (1/ε)^O(K). To complement our upper bound, we show that if degree-m distinguishing moments do not exist for some subspace, then any SQ learner for the corresponding class of MIMs requires complexity d^Ω(m). As an application, we give the first efficient learner for the class of positive-homogeneous L-Lipschitz K-MIMs. The resulting algorithm has complexity poly(d) · 2^poly(KL/ε). This gives a new PAC learning algorithm for Lipschitz homogeneous ReLU networks with complexity independent of the network size, removing the exponential dependence incurred in prior work.
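To make the model class concrete, the following is a minimal sketch (not taken from the paper) of a synthetic K-MIM under the Gaussian distribution: a positive-homogeneous, Lipschitz ReLU network whose value depends only on the projection of the input onto a K-dimensional subspace. All dimensions, names, and the data-generation choices below are illustrative assumptions.

```python
import numpy as np

d, K, hidden = 100, 3, 16          # ambient dim, subspace dim, network width (assumed)
rng = np.random.default_rng(0)

# Orthonormal basis of a hidden K-dimensional subspace; rows of W span it.
W, _ = np.linalg.qr(rng.standard_normal((d, K)))
W = W.T                            # shape (K, d)

# A one-hidden-layer ReLU network g : R^K -> R with no biases, so the
# composition f(x) = g(Wx) is positive-homogeneous and Lipschitz, i.e. a
# K-MIM of the kind covered by the application in the abstract.
A = rng.standard_normal((hidden, K))
b = rng.standard_normal(hidden)

def f(x):
    """Evaluate the K-MIM f(x) = sum_j b_j * relu(<a_j, W x>)."""
    z = W @ x                      # project onto the K-dimensional subspace
    return b @ np.maximum(A @ z, 0.0)

# Labeled Gaussian examples (x, f(x)); a PAC learner only observes such pairs
# (possibly with adversarial or independent label noise added to y).
X = rng.standard_normal((5, d))
y = np.array([f(x) for x in X])
print(y)
```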