Learning to Quantize Deep Neural Networks: A Competitive-Collaborative Approach
Md Fahim Faysal Khan
Abstract
By reducing model size and computation cost for dedicated AI accelerator designs, neural network quantization methods have recently attracted significant attention. Unfortunately, merely minimizing quantization loss with a constant discretization causes accuracy deterioration. In this paper, we propose an iterative, accuracy-driven learning framework of competitive-collaborative quantization (CCQ) that gradually adapts the bit-precision of each individual layer. Unlike prior quantization policies that keep the first and last layers of the network at full precision, CCQ applies layer-wise competition under any target quantization policy, combined with holistic layer fine-tuning to recover accuracy, so that state-of-the-art networks can be quantized entirely without significant accuracy degradation.
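To make the competitive-collaborative idea concrete, the following is a minimal sketch, not the paper's implementation: in each round, every layer "competes" by offering a one-bit reduction, the layer with the smallest estimated accuracy drop wins, and a simulated holistic fine-tuning step (the "collaboration") recovers part of the drop. The per-layer sensitivities, the recovery factor, and the stopping tolerance are all toy placeholders assumed here for illustration.

```python
import random

def ccq_sketch(num_layers=4, start_bits=8, min_bits=2,
               max_total_drop=0.05, seed=0):
    """Toy sketch of competitive-collaborative quantization (CCQ).

    Loop: (1) competition -- tentatively drop one bit from each eligible
    layer and estimate the accuracy cost; (2) pick the cheapest layer;
    (3) collaboration -- assume holistic fine-tuning recovers about half
    of that cost. Stop when the accuracy budget is exhausted or every
    layer has reached min_bits. Sensitivities are random placeholders.
    """
    rng = random.Random(seed)
    # Hypothetical per-layer accuracy cost of removing one bit.
    sensitivity = [rng.uniform(0.002, 0.01) for _ in range(num_layers)]
    bits = [start_bits] * num_layers
    accuracy = 1.0
    floor = 1.0 - max_total_drop  # lowest accuracy we will accept

    while True:
        # Competition: layers still above min_bits bid their drop cost.
        candidates = [(sensitivity[i], i) for i in range(num_layers)
                      if bits[i] > min_bits]
        if not candidates:
            break
        drop, layer = min(candidates)
        drop *= 0.5  # collaboration: fine-tuning recovers ~half (toy)
        if accuracy - drop < floor:
            break  # accuracy budget exhausted
        bits[layer] -= 1
        accuracy -= drop
    return bits, accuracy
```

Because no layer is exempt, the first and last layers compete on equal footing with the rest, which mirrors the abstract's claim that the whole network can be quantized.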