SOTAVerified

Recurrent Batch Normalization

2016-03-30 · Code Available

Tim Cooijmans, Nicolas Ballas, César Laurent, Çağlar Gülçehre, Aaron Courville


Abstract

We propose a reparameterization of LSTM that brings the benefits of batch normalization to recurrent neural networks. Whereas previous works only apply batch normalization to the input-to-hidden transformation of RNNs, we demonstrate that it is both possible and beneficial to batch-normalize the hidden-to-hidden transition, thereby reducing internal covariate shift between time steps. We evaluate our proposal on various sequential problems such as sequence classification, language modeling and question answering. Our empirical results show that our batch-normalized LSTM consistently leads to faster convergence and improved generalization.
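To make the idea concrete, the following is a minimal NumPy sketch of one step of the batch-normalized LSTM described in the abstract: batch normalization is applied separately to the input-to-hidden and hidden-to-hidden transformations before they are summed, and again to the cell state before the output nonlinearity. Function names, shapes, and the use of per-timestep batch statistics are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def batch_norm(x, gamma, eps=1e-5):
    # Normalize each feature over the batch dimension. The shift (beta)
    # is folded into the LSTM bias term, so only a scale gamma is used.
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps)

def bn_lstm_step(x_t, h_prev, c_prev, Wx, Wh, b,
                 gamma_x, gamma_h, gamma_c):
    """One training-time step of a batch-normalized LSTM (sketch).

    x_t: (batch, input_dim), h_prev/c_prev: (batch, hidden_dim).
    Wx: (input_dim, 4*hidden_dim), Wh: (hidden_dim, 4*hidden_dim).
    """
    # BN is applied to the hidden-to-hidden transition as well as the
    # input-to-hidden one, each with its own gamma, before summing.
    gates = batch_norm(x_t @ Wx, gamma_x) + batch_norm(h_prev @ Wh, gamma_h) + b
    i, f, o, g = np.split(gates, 4, axis=1)
    c_t = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
    # The cell state is also batch-normalized before the output gate.
    h_t = sigmoid(o) * np.tanh(batch_norm(c_t, gamma_c))
    return h_t, c_t

# Hypothetical usage with small random weights; the paper recommends
# initializing the BN scale gamma to a small value such as 0.1.
rng = np.random.default_rng(0)
B, D, H = 4, 3, 5
x = rng.normal(size=(B, D))
h0 = np.zeros((B, H))
c0 = np.zeros((B, H))
Wx = rng.normal(size=(D, 4 * H))
Wh = rng.normal(size=(H, 4 * H))
b = np.zeros(4 * H)
h1, c1 = bn_lstm_step(x, h0, c0, Wx, Wh, b,
                      0.1 * np.ones(4 * H), 0.1 * np.ones(4 * H),
                      0.1 * np.ones(H))
```

This sketch uses per-timestep batch statistics, as in training; at test time the paper's approach substitutes running averages accumulated per time step.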

Tasks

Benchmark Results

| Dataset | Model | Metric | Claimed | Verified | Status |
|---------|-------|--------|---------|----------|--------|
| Text8 | BN LSTM | Bits per Character (BPC) | 1.36 | | Unverified |

Reproductions