Uncertainty Quantification From Scaling Laws in Deep Neural Networks
Ibrahim Elsharkawy, Yonatan Kahn, Benjamin Hooberman
Abstract
Quantifying the uncertainty from machine learning analyses is critical to their use in the physical sciences. In this work we focus on uncertainty inherited from the initialization distribution of neural networks. We compute the mean $\mu_L$ and variance $\sigma_L^2$ of the test loss $L$ for an ensemble of multi-layer perceptrons (MLPs) with neural tangent kernel (NTK) initialization in the infinite-width limit, and compare empirically to the results from finite-width networks for three example tasks: MNIST classification, CIFAR classification and calorimeter energy regression. We observe scaling laws as a function of training set size $N_D$ for both $\mu_L$ and $\sigma_L$, but find that the coefficient of variation $\epsilon_L \equiv \sigma_L/\mu_L$ becomes independent of $N_D$ at both infinite and finite width for sufficiently large $N_D$. This implies that the coefficient of variation of a finite-width network may be approximated by its infinite-width value, and may in principle be calculable using finite-width perturbation theory.
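
As an illustration of the quantities named in the abstract, the following minimal sketch estimates the mean, standard deviation, and coefficient of variation of the test loss from an ensemble of networks that differ only in their initialization seed. The function name `loss_statistics`, the use of the sample standard deviation (`ddof=1`), and the loss values are assumptions made for this example; they are not taken from the paper.

```python
import numpy as np

def loss_statistics(test_losses):
    """Return (mean, std, coefficient of variation) of per-network test losses."""
    losses = np.asarray(test_losses, dtype=float)
    mu = losses.mean()
    # Sample standard deviation across the ensemble (an assumed convention).
    sigma = losses.std(ddof=1)
    return mu, sigma, sigma / mu

# Hypothetical test losses from four networks with different initialization seeds.
ensemble_losses = [0.10, 0.12, 0.08, 0.10]
mu, sigma, cv = loss_statistics(ensemble_losses)

# Rescaling every loss by the same factor changes mu and sigma but not their ratio.
_, _, cv_rescaled = loss_statistics([0.5 * x for x in ensemble_losses])
```

The rescaling step shows why the ratio can stay fixed while the mean and standard deviation both change: if the two follow the same scaling with training set size, the dependence cancels in the coefficient of variation.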