Phase Transitions in Rate Distortion Theory and Deep Learning
Philipp Grohs, Andreas Klotz, Felix Voigtlaender
Abstract
Rate distortion theory is concerned with optimally encoding a given signal class S using a budget of R bits, as R → ∞. We say that S can be compressed at rate s if we can achieve an error of O(R^-s) for encoding S; the supremal compression rate is denoted s^*(S). Given a fixed coding scheme, there usually are elements of S that are compressed at a higher rate than s^*(S) by the given coding scheme; we study the size of this set of signals. We show that for certain "nice" signal classes S, a phase transition occurs: We construct a probability measure P on S such that for every coding scheme C and any s > s^*(S), the set of signals encoded with error O(R^-s) by C forms a P-null-set. In particular, our results apply to balls in Besov and Sobolev spaces that embed compactly into L^2(Ω) for a bounded Lipschitz domain Ω. As an application, we show that several existing sharpness results concerning function approximation using deep neural networks are generically sharp.

We also provide quantitative and non-asymptotic bounds on the probability that a random f ∈ S can be encoded to within accuracy ε using R bits. This result is applied to the problem of approximately representing f ∈ S to within accuracy ε by a (quantized) neural network that is constrained to have at most W nonzero weights and is generated by an arbitrary "learning" procedure. We show that for any s > s^*(S) there are constants c, C such that, no matter how we choose the "learning" procedure, the probability of success is bounded from above by min{1, 2^{C·W·⌈log_2(1+W)⌉^2 − c·ε^{-1/s}}}.
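A minimal sketch of how this upper bound behaves, assuming placeholder values c = C = 1, rate s = 0.5, and weight budget W = 1000 (the abstract only asserts that such constants exist; these particular values are illustrative, not from the paper):

import math

def success_probability_bound(W, eps, s, c=1.0, C=1.0):
    """Evaluate min{1, 2^(C*W*ceil(log2(1+W))^2 - c*eps^(-1/s))}, the stated
    upper bound on the probability that a learning procedure outputs a network
    with at most W nonzero (quantized) weights achieving accuracy eps, for a
    rate s > s^*(S). The constants c, C and the rate s are placeholders."""
    exponent = C * W * math.ceil(math.log2(1 + W)) ** 2 - c * eps ** (-1.0 / s)
    if exponent >= 0:
        return 1.0          # bound is trivial: min{1, ...} = 1
    return 2.0 ** exponent  # decays rapidly once eps^(-1/s) dominates

# For a fixed budget of W nonzero weights, the bound collapses from 1 to
# (numerically) 0 as the accuracy requirement eps tightens:
for eps in (1e-1, 1e-2, 1e-3):
    print(eps, success_probability_bound(W=1000, eps=eps, s=0.5))

For fixed W the first term in the exponent is constant, while c·ε^{-1/s} grows without bound as ε → 0, so the bound eventually decays super-exponentially in ε^{-1/s}; this is the quantitative, non-asymptotic form of the phase transition.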