
Lattice Quantization

2021-09-29

Clément Metz, Thibault Allenet, Johannes Christian Thiele, Antoine Dupret, Olivier Bichler


Abstract

Low-bit quantization of weights in increasingly large deep convolutional neural networks (DCNNs) can be critical for their implementation in memory-constrained hardware systems. Post-training quantization consists of quantizing a model without retraining, which is user-friendly, fast, and data-frugal. In this paper, we propose LatticeQ, a new post-training weight quantization method designed for DCNNs. Instead of the standard scalar rounding widely used in state-of-the-art quantization methods, LatticeQ uses a quantizer based on lattices - discrete algebraic structures - which we show are able to exploit the inner correlations between the model parameters. LatticeQ allows us to achieve state-of-the-art results in post-training quantization, enabling us to approach full-precision accuracies at bitwidths previously not accessible to post-training quantization methods. In particular, we achieve ImageNet classification results close to full precision on the popular ResNet-18/50, with only 0.5% and 5% accuracy drops for the 4-bit-weight and 3-bit-weight model architectures respectively.
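To illustrate the difference between scalar rounding and a lattice quantizer, the sketch below quantizes weight blocks to the classical D_n lattice (integer vectors with even coordinate sum) using the well-known nearest-point rule: round each coordinate, and if the sum is odd, re-round the coordinate with the largest rounding error. Note this is a generic illustration of lattice quantization, not the specific lattice or algorithm used by LatticeQ; the function names and the block size `dim` are assumptions for the example.

```python
import numpy as np

def nearest_point_dn(x):
    """Nearest point in the D_n lattice (integer vectors with even sum)."""
    f = np.round(x)
    # D_n requires the coordinate sum to be even; if it is odd,
    # move the coordinate with the largest rounding error to its
    # second-nearest integer.
    if int(f.sum()) % 2 != 0:
        i = int(np.argmax(np.abs(x - f)))
        f[i] += 1.0 if x[i] > f[i] else -1.0
    return f

def lattice_quantize(weights, scale, dim=4):
    """Quantize a weight tensor block-wise to a scaled D_dim lattice.

    Hypothetical helper: flattens the tensor, pads to a multiple of
    `dim`, quantizes each block, and restores the original shape.
    """
    w = weights.ravel() / scale
    pad = (-w.size) % dim
    w = np.pad(w, (0, pad))
    q = np.array([nearest_point_dn(b) for b in w.reshape(-1, dim)])
    return (q.ravel() * scale)[:weights.size].reshape(weights.shape)
```

Unlike per-coordinate scalar rounding, the D_n rule quantizes coordinates jointly, which is what lets a lattice quantizer exploit correlations between parameters: for example, `nearest_point_dn` maps `[0.6, 0.1, 0.0, 0.0]` to the all-zeros point rather than to `[1, 0, 0, 0]`, which is not a valid D_4 lattice point.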
