
Dimension-Free Bounds for Low-Precision Training

ICLR 2019 · 2019-05-01

Zheng Li, Christopher De Sa


Abstract

Low-precision training is a promising way of decreasing the time and energy cost of training machine learning models. Previous work has analyzed low-precision training algorithms, such as low-precision stochastic gradient descent, and derived theoretical bounds on their convergence rates. These bounds tend to depend on the dimension of the model d in that the number of bits needed to achieve a particular error bound increases as d increases. This is undesirable because a motivating application for low-precision training is large-scale models, such as deep learning, where d can be huge. In this paper, we prove dimension-independent bounds for low-precision training algorithms that use fixed-point arithmetic, which lets us better understand what affects the convergence of these algorithms as parameters scale. Our methods also generalize naturally to let us prove new convergence bounds on low-precision training with other quantization schemes, such as low-precision floating-point computation and logarithmic quantization.
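To make the setting concrete, the following is a minimal sketch of low-precision SGD with fixed-point quantization via stochastic rounding, the unbiased rounding scheme typically assumed in this line of analysis. The objective, step size, grid spacing `delta`, bit width, and helper name `quantize_fixed_point` are all illustrative choices, not the paper's exact algorithm or notation.

```python
import numpy as np

def quantize_fixed_point(x, delta, bits):
    """Stochastically round x onto a signed fixed-point grid with
    spacing delta and the given bit width. Stochastic rounding is
    unbiased: E[Q(x)] = x for x inside the representable range."""
    limit = delta * (2 ** (bits - 1) - 1)   # largest representable magnitude
    scaled = np.clip(x, -limit, limit) / delta
    lower = np.floor(scaled)
    # round up with probability equal to the fractional part
    prob = scaled - lower
    rounded = lower + (np.random.rand(*np.shape(x)) < prob)
    return rounded * delta

np.random.seed(0)

# Toy low-precision SGD on the quadratic f(w) = 0.5 * ||w||^2,
# whose gradient is simply w. Every iterate is kept on the
# fixed-point grid, as in low-precision training.
delta, bits, lr = 2 ** -8, 16, 0.1
w = quantize_fixed_point(np.ones(4), delta, bits)
for _ in range(100):
    grad = w
    w = quantize_fixed_point(w - lr * grad, delta, bits)
# w ends up near the optimum, up to a noise floor set by delta
```

Under this scheme the iterates converge to a neighborhood of the optimum whose size is governed by the grid spacing `delta` (equivalently, the number of bits), which is the quantity the paper's bounds control without reference to the dimension d.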
