Neural tangent kernel eigenvalues accurately predict generalization

2021-09-29

James B Simon, Madeline Dickens, Michael Deweese

Abstract

Finding a quantitative theory of neural network generalization has long been a central goal of deep learning research. We extend recent results to demonstrate that, by examining the eigensystem of a neural network's "neural tangent kernel," one can predict its generalization performance when learning arbitrary functions. Our theory accurately predicts not only test mean-squared-error but all first- and second-order statistics of the network's learned function. Furthermore, using a measure quantifying the "learnability" of a given target function, we prove a new "no free lunch" theorem characterizing a fundamental tradeoff in the inductive bias of wide neural networks: improving a network's generalization for a given target function must worsen its generalization for orthogonal functions. We further demonstrate the utility of our theory by analytically predicting two surprising phenomena (worse-than-chance generalization on hard-to-learn functions and nonmonotonic error curves in the small-data regime), which we subsequently observe in experiments. Though our theory is derived for infinite-width architectures, we find it agrees with networks as narrow as width 20, suggesting it is predictive of generalization in practical neural networks.
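The tradeoff described in the abstract can be illustrated numerically with a small sketch. Everything in it is an assumption made for illustration and is not taken from the abstract: the learnability of a target f is taken to be <f, f_hat> / <f, f>, where f_hat is the ridgeless kernel-regression predictor fit to f on the training points; the kernel is a simple stand-in on a 12-point circle, not an actual neural tangent kernel; and the helper names (`solve`, `kernel`, `learnability`, `fourier_basis`) are hypothetical. Under these assumptions, the learnabilities of any orthogonal basis of target functions sum to the number of training points, so raising one must lower others.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = aug[i][n] - sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = s / aug[i][i]
    return x

def kernel(x, y):
    # Assumption: stand-in positive-definite kernel on the circle, NOT a real NTK.
    return math.exp(math.cos(x - y) - 1.0)

def learnability(f, xs, train_idx):
    """Assumed definition: <f, f_hat> / <f, f> for ridgeless kernel regression."""
    K = [[kernel(xs[a], xs[b]) for b in train_idx] for a in train_idx]
    alpha = solve(K, [f[a] for a in train_idx])
    f_hat = [sum(kernel(x, xs[a]) * w for a, w in zip(train_idx, alpha))
             for x in xs]
    return sum(u * v for u, v in zip(f, f_hat)) / sum(u * u for u in f)

def fourier_basis(xs):
    """Orthogonal real Fourier modes on an evenly spaced grid of even size."""
    m = len(xs)
    basis = [[1.0] * m]
    for k in range(1, m // 2):
        basis.append([math.cos(k * x) for x in xs])
        basis.append([math.sin(k * x) for x in xs])
    basis.append([math.cos((m // 2) * x) for x in xs])  # highest-frequency mode
    return basis

M_POINTS = 12
xs = [2 * math.pi * j / M_POINTS for j in range(M_POINTS)]
train_idx = [0, 3, 4, 9]  # arbitrary choice of 4 training points

scores = [learnability(f, xs, train_idx) for f in fourier_basis(xs)]
total = sum(scores)  # equals len(train_idx) up to floating-point error
print([round(s, 3) for s in scores], round(total, 6))
```

The sum is fixed because the map from a target to its fitted predictor is linear, and its trace equals the number of training points regardless of the kernel; changing `kernel` only redistributes learnability among the basis functions.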