UnICORNN: A recurrent model for learning very long time dependencies

2021-03-09

T. Konstantin Rusch, Siddhartha Mishra


Code

github.com/tk-rusch/unicornn
Official, in paper, PyTorch, ★ 27

Abstract

The design of recurrent neural networks (RNNs) to accurately process sequential inputs with long-time dependencies is very challenging on account of the exploding and vanishing gradient problem. To overcome this, we propose a novel RNN architecture which is based on a structure-preserving discretization of a Hamiltonian system of second-order ordinary differential equations that models networks of oscillators. The resulting RNN is fast, invertible (in time), and memory efficient, and we derive rigorous bounds on the hidden state gradients to prove that it mitigates the exploding and vanishing gradient problem. A suite of experiments is presented to demonstrate that the proposed RNN provides state-of-the-art performance on a variety of learning tasks with (very) long-time dependencies.
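To illustrate what "invertible (in time)" can mean for a discretized network of oscillators, the following is a minimal sketch in plain Python. It is not the authors' code and is not taken from the linked repository. The update rule (a symplectic Euler step with a tanh forcing term), the function names `force`, `step`, and `step_inverse`, and the parameters `w`, `V`, `b`, `dt`, and `alpha` are all assumptions made for illustration.

```python
import math

def force(y, u, w, V, b, alpha):
    # Hypothetical forcing term per hidden unit: tanh(w_i*y_i + (V u)_i + b_i) + alpha*y_i.
    # w acts elementwise (no coupling between units inside one layer); V mixes the input.
    return [
        math.tanh(w[i] * y[i] + sum(V[i][j] * u[j] for j in range(len(u))) + b[i])
        + alpha * y[i]
        for i in range(len(y))
    ]

def step(y, z, u, w, V, b, dt=0.1, alpha=1.0):
    # Symplectic Euler: update the velocity z from the old position y,
    # then update the position y with the new velocity.
    f = force(y, u, w, V, b, alpha)
    z_new = [z[i] - dt * f[i] for i in range(len(z))]
    y_new = [y[i] + dt * z_new[i] for i in range(len(y))]
    return y_new, z_new

def step_inverse(y_new, z_new, u, w, V, b, dt=0.1, alpha=1.0):
    # Undo step(): recover the old position first, then the old velocity.
    # This works because the force depends only on the old position.
    y = [y_new[i] - dt * z_new[i] for i in range(len(y_new))]
    f = force(y, u, w, V, b, alpha)
    z = [z_new[i] + dt * f[i] for i in range(len(z_new))]
    return y, z

# Usage: run forward over a short input sequence, then reconstruct the start state.
w = [0.5, -0.3, 0.8]
V = [[0.2, -0.1], [0.4, 0.3], [-0.5, 0.1]]
b = [0.0, 0.1, -0.1]
inputs = [[math.sin(0.1 * t), math.cos(0.1 * t)] for t in range(50)]

y, z = [0.0] * 3, [0.0] * 3
for u in inputs:
    y, z = step(y, z, u, w, V, b)

y_back, z_back = y, z
for u in reversed(inputs):
    y_back, z_back = step_inverse(y_back, z_back, u, w, V, b)
```

Because each earlier state can be recomputed from the later one, a recurrence of this kind does not need to store every hidden state for the backward pass, which is one way an invertible RNN can be memory efficient.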

Tasks

Sentiment Analysis, Sequential Image Classification, Time Series Analysis, Time Series Classification

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
IMDb	UnICORNN	Accuracy	88.4	—	Unverified
