Variational Dropout Sparsifies Deep Neural Networks
Dmitry Molchanov, Arsenii Ashukha, Dmitry Vetrov
Code
- github.com/ars-ashuha/variational-dropout-sparsifies-dnn (TensorFlow) ★ 314
- github.com/HolyBayes/pytorch_ard (PyTorch) ★ 85
- github.com/HolyBayes/VarDropPytorch (PyTorch) ★ 85
- github.com/ModelZoos/ModelZooDataset (PyTorch) ★ 59
- github.com/ars-ashuha/sparse-vd-pytorch (PyTorch) ★ 10
- github.com/Cerphilly/Sparse_VD_tf2 (TensorFlow) ★ 1
- github.com/Faptimus420/Sparse_VD_keras-core (JAX) ★ 0
- github.com/senya-ashukha/variational-dropout-sparsifies-dnn (TensorFlow) ★ 0
- github.com/maxblumental/variational-drouput (PyTorch) ★ 0
- github.com/cbbjames/Variational-Dropout---ResNet- (no framework listed) ★ 0
Abstract
We explore a recently proposed Variational Dropout technique that provides an elegant Bayesian interpretation of Gaussian Dropout. We extend Variational Dropout to the case when dropout rates are unbounded, propose a way to reduce the variance of the gradient estimator, and report the first experimental results with individual dropout rates per weight. Interestingly, this leads to extremely sparse solutions in both fully-connected and convolutional layers. The effect is similar to automatic relevance determination in empirical Bayes, but has a number of advantages. We reduce the number of parameters by up to 280 times on LeNet architectures and up to 68 times on VGG-like networks with a negligible decrease in accuracy.
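To make the abstract's idea concrete, here is a minimal NumPy sketch (not the authors' code) of a Sparse Variational Dropout layer. Each weight carries a mean `theta` and a log-variance `log_sigma2`; its dropout rate is `alpha = sigma^2 / theta^2`, which is left unbounded during training. Training uses sampled pre-activations to reduce gradient-estimator variance, and at test time weights with large `log alpha` (here a commonly used threshold of 3, i.e. `alpha` above roughly 20) are pruned. All names and the toy values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_alpha(theta, log_sigma2, eps=1e-8):
    # log alpha = log sigma^2 - log theta^2 (per-weight dropout rate)
    return log_sigma2 - np.log(theta ** 2 + eps)

def sparse_vd_forward(x, theta, log_sigma2, train=True, thresh=3.0):
    if train:
        # Sample the layer's pre-activations directly from their
        # induced Gaussian, which lowers the variance of the
        # stochastic gradients compared to sampling each weight.
        mean = x @ theta
        var = (x ** 2) @ np.exp(log_sigma2)
        return mean + np.sqrt(var + 1e-8) * rng.standard_normal(mean.shape)
    # Test time: weights whose dropout rate is too high carry almost
    # no signal, so they are zeroed out -> a sparse deterministic layer.
    mask = log_alpha(theta, log_sigma2) < thresh
    return x @ (theta * mask)

# Toy 4x3 layer: training drove almost all weights to high noise,
# so thresholding keeps only the single informative weight.
theta = rng.standard_normal((4, 3))
log_sigma2 = np.full((4, 3), 10.0)   # very noisy weights ...
log_sigma2[0, 0] = -10.0             # ... except one low-noise weight
mask = log_alpha(theta, log_sigma2) < 3.0
print(mask.sum(), "of", mask.size, "weights kept")  # 1 of 12 weights kept
```

In the paper's experiments, this kind of per-weight thresholding is what yields the reported compression; the sketch only shows the mechanics of the pruning rule and the noisy forward pass.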