
Differentiable Sparsification for Deep Neural Networks

2021-05-21 · NeurIPS 2021

Yongjin Lee


Abstract

Deep neural networks have relieved human experts of the burden of feature engineering. However, comparable effort is now required to determine an effective architecture. Moreover, as networks have grown in size, considerable resources are also invested in reducing their size. Sparsifying an over-complete model addresses these problems by removing redundant parameters or connections. In this study, we propose a fully differentiable sparsification method for deep neural networks, which allows parameters to become exactly zero during training with stochastic gradient descent. The proposed method can thus simultaneously learn the sparsified structure and the weights of a network in an end-to-end manner; it can be applied directly to modern deep neural networks and imposes minimal overhead on the training process. To the authors' best knowledge, it is the first fully [sub-]differentiable sparsification method that zeroes out components, and it provides a foundation for future structure-learning and model-compression methods.
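The abstract does not spell out the parameterization, so the following is only a rough, hypothetical sketch of the general idea it describes: building a sub-differentiable gate out of `relu`/`exp` so that a component's multiplier can reach *exactly* zero (not merely a small value) while remaining trainable by gradient descent. The names `gate` and `alpha` are illustrative assumptions, not the paper's notation.

```python
import numpy as np

def gate(alpha):
    """Sub-differentiable gates that hit exactly 0 for alpha <= 0.

    relu(exp(alpha) - 1) is zero on a whole half-line, so gradient-based
    training can drive gates to exact zero; surviving gates are normalized.
    This is an illustrative construction, not the paper's exact formula.
    """
    a = np.maximum(0.0, np.exp(alpha) - 1.0)  # exact zeros for alpha <= 0
    return a / (a.sum() + 1e-12)              # normalize remaining components

alpha = np.array([-0.5, 1.0, 0.2, -2.0])      # learned gate parameters
weights = np.array([0.3, -1.2, 0.7, 2.0])     # underlying free parameters
g = gate(alpha)
sparse_weights = g * weights                  # pruned entries are exactly 0.0
```

Because the zeros are exact rather than thresholded after training, the sparsified structure and the remaining weights can be optimized jointly in one end-to-end run, which is the property the abstract emphasizes.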
