Hamiltonian Deep Neural Networks Guaranteeing Non-vanishing Gradients by Design

2021-05-27

Clara Lucía Galimberti, Luca Furieri, Liang Xu, Giancarlo Ferrari-Trecate

Code

github.com/DecodEPFL/HamiltonianNet
Official · In paper · PyTorch · ★ 23
github.com/decodepfl/deepdiscoph
PyTorch · ★ 6
github.com/ClaraGalimberti/HamiltonianNet
PyTorch · ★ 1

Abstract

Training deep neural networks (DNNs) can be difficult due to vanishing and exploding gradients during weight optimization through backpropagation. To address this problem, we propose a general class of Hamiltonian DNNs (H-DNNs) that stem from the discretization of continuous-time Hamiltonian systems and include several existing DNN architectures based on ordinary differential equations. Our main result is that a broad set of H-DNNs ensures non-vanishing gradients by design for an arbitrary network depth. This is obtained by proving that, using a semi-implicit Euler discretization scheme, the backward sensitivity matrices involved in gradient computations are symplectic. We also provide an upper bound on the magnitude of sensitivity matrices and show that exploding gradients can be controlled through regularization. Finally, we enable distributed implementations of backward and forward propagation algorithms in H-DNNs by characterizing appropriate sparsity constraints on the weight matrices. The good performance of H-DNNs is demonstrated on benchmark classification problems, including image classification with the MNIST dataset.
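
The symplecticity claim in the abstract can be illustrated numerically. The following is a minimal NumPy sketch, not the authors' implementation: it assumes a separable Hamiltonian H(p, q) = sum(log cosh(Kp p + bp)) + sum(log cosh(Kq q + bq)), random weights, and the canonical matrix J = [[0, I], [-I, 0]]. The names `layer`, `layer_jacobian`, `Kp`, `Kq`, `bp`, `bq`, and `h` are hypothetical and chosen only for this example. It stacks semi-implicit Euler layers, accumulates the input-output Jacobian M (whose transpose acts on gradients during backpropagation), and checks that M satisfies M^T J M = J.

```python
import numpy as np

def layer(p, q, Kp, bp, Kq, bq, h):
    """One semi-implicit Euler step for the assumed separable Hamiltonian."""
    p_next = p - h * Kq.T @ np.tanh(Kq @ q + bq)       # p update uses the old q
    q_next = q + h * Kp.T @ np.tanh(Kp @ p_next + bp)  # q update uses the new p
    return p_next, q_next

def layer_jacobian(p, q, Kp, bp, Kq, bq, h):
    """Analytic Jacobian of (p, q) -> (p_next, q_next) as a product of two shears."""
    n = p.size
    I, Z = np.eye(n), np.zeros((n, n))
    p_next = p - h * Kq.T @ np.tanh(Kq @ q + bq)
    # Hessians of the two log-cosh terms; both are symmetric by construction.
    Sq = Kq.T @ np.diag(1.0 - np.tanh(Kq @ q + bq) ** 2) @ Kq
    Sp = Kp.T @ np.diag(1.0 - np.tanh(Kp @ p_next + bp) ** 2) @ Kp
    A = np.block([[I, -h * Sq], [Z, I]])  # Jacobian of the p update
    B = np.block([[I, Z], [h * Sp, I]])   # Jacobian of the q update
    return B @ A

# Illustrative sizes and step length (assumptions, not values from the paper).
rng = np.random.default_rng(0)
n, depth, h = 3, 20, 0.1
J = np.block([[np.zeros((n, n)), np.eye(n)], [-np.eye(n), np.zeros((n, n))]])

p, q = rng.normal(size=n), rng.normal(size=n)
M = np.eye(2 * n)  # accumulated Jacobian of the whole network
for _ in range(depth):
    Kp, Kq = rng.normal(size=(n, n)), rng.normal(size=(n, n))
    bp, bq = rng.normal(size=n), rng.normal(size=n)
    M = layer_jacobian(p, q, Kp, bp, Kq, bq, h) @ M  # chain rule, layer by layer
    p, q = layer(p, q, Kp, bp, Kq, bq, h)

print("symplectic:", np.allclose(M.T @ J @ M, J, atol=1e-8))
print("spectral norm of M:", np.linalg.norm(M, 2))
```

Because the singular values of a symplectic matrix come in reciprocal pairs, the spectral norm of M cannot fall below 1 regardless of depth, which is one way to read "non-vanishing gradients by design". The sketch says nothing about exploding gradients, which the abstract addresses separately through regularization.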

Tasks

Image Classification, Sensitivity
