SOTAVerified

On the universality of deep learning

2020-12-01NeurIPS 2020Unverified0· sign in to hype

Emmanuel Abbe, Colin Sandon

Unverified — Be the first to reproduce this paper.

Reproduce

Abstract

This paper shows that deep learning, i.e., neural networks trained by SGD, can learn in polytime any function class that can be learned in polytime by some algorithm, including parities. This universal result is further shown to be robust, i.e., it holds under possibly poly-noise on the gradients, which gives a separation between deep learning and statistical query algorithms, as the latter are not comparably universal due to cases like parities. This also shows that SGD-based deep learning does not suffer from the limitations of the perceptron discussed by Minsky-Papert '69. The paper further complement this result with a lower-bound on the generalization error of descent algorithms, which implies in particular that the robust universality breaks down if the gradients are averaged over large enough batches of samples as in full-GD, rather than fewer samples as in SGD.

Tasks

Reproductions