
Integral Pruning on Activations and Weights for Efficient Neural Networks

2019-05-01 · ICLR 2019

Qing Yang, Wei Wen, Zuoguan Wang, Yiran Chen, Hai Li


Abstract

With the rapid scaling up of deep neural networks (DNNs), extensive research on network model compression, such as weight pruning, has been performed for efficient deployment. This work aims to advance compression beyond the weights to the activations of DNNs. We propose the Integral Pruning (IP) technique, which integrates activation pruning with weight pruning. By learning the differing importance of neuron responses and connections, the resulting network, namely IPnet, balances sparsity between activations and weights and thereby further improves execution efficiency. The feasibility and effectiveness of IPnet are thoroughly evaluated through various network models with different activation functions and on different datasets. With <0.5% disturbance to the testing accuracy, IPnet saves 71.1%–96.35% of computation cost compared to the original dense models, with up to 5.8x and 10x reductions in activation and weight numbers, respectively.
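The abstract's core idea, combining weight pruning with activation pruning to compound sparsity, can be illustrated with a minimal sketch. This is not the authors' IP algorithm (which learns per-layer importance); it stands in magnitude-based weight pruning and a fixed activation threshold for the learned criteria, purely to show how the two sparsity sources multiply into computation savings:

```python
import numpy as np

def prune_weights(w, sparsity):
    """Magnitude-based weight pruning: zero the smallest-|w| fraction.
    A simple stand-in for the learned connection-importance criterion."""
    k = int(sparsity * w.size)
    if k == 0:
        return w.copy()
    thresh = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) > thresh, w, 0.0)

def prune_activations(a, threshold):
    """Activation pruning: zero weak neuron responses.
    A fixed threshold stands in for the learned per-layer threshold."""
    return np.where(a > threshold, a, 0.0)

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))
x = rng.normal(size=(32, 64))

w_p = prune_weights(w, 0.8)        # ~80% weight sparsity
a = np.maximum(x @ w_p, 0.0)       # ReLU activations of the pruned layer
a_p = prune_activations(a, 0.5)    # additionally prune weak activations

weight_sparsity = 1.0 - np.count_nonzero(w_p) / w_p.size
act_sparsity = 1.0 - np.count_nonzero(a_p) / a_p.size
print(f"weight sparsity: {weight_sparsity:.2f}, "
      f"activation sparsity: {act_sparsity:.2f}")
```

Because a multiply-accumulate is skipped whenever either its weight or its input activation is zero, the fraction of remaining dense work is roughly (1 − weight sparsity) × (1 − activation sparsity), which is how IPnet reaches its reported computation savings.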
