Learning to Remember Rare Events

2017-03-09

Łukasz Kaiser, Ofir Nachum, Aurko Roy, Samy Bengio

Code

github.com/tensorflow/models (official, in paper; TensorFlow; ★ 77,694)
github.com/rdspring1/lsh_deeplearning (PyTorch; ★ 0)

Abstract

Despite recent advances, memory-augmented deep neural networks are still limited when it comes to life-long and one-shot learning, especially in remembering rare events. We present a large-scale life-long memory module for use in deep learning. The module exploits fast nearest-neighbor algorithms for efficiency and thus scales to large memory sizes. Except for the nearest-neighbor query, the module is fully differentiable and trained end-to-end with no extra supervision. It operates in a life-long manner, i.e., without the need to reset it during training. Our memory module can be easily added to any part of a supervised neural network. To show its versatility we add it to a number of networks, from simple convolutional ones tested on image classification to deep sequence-to-sequence and recurrent-convolutional models. In all cases, the enhanced network gains the ability to remember and do life-long one-shot learning. Our module remembers training examples shown many thousands of steps in the past and it can successfully generalize from them. We set new state-of-the-art for one-shot learning on the Omniglot dataset and demonstrate, for the first time, life-long one-shot learning in recurrent neural networks on a large-scale machine translation task.
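
The abstract describes a memory that is read with a nearest-neighbor query and is never reset during training. The following is a minimal sketch of that idea in NumPy, not the authors' implementation. The class name `TinyMemory`, the cosine-similarity lookup, the key-averaging step, and the overwrite-the-oldest-slot policy are all assumptions made for illustration; the abstract itself does not specify them, and nothing here is trained end-to-end.

```python
import numpy as np


class TinyMemory:
    """Hypothetical key-value memory with nearest-neighbor reads (illustrative only)."""

    def __init__(self, size, dim, seed=0):
        rng = np.random.default_rng(seed)
        keys = rng.normal(size=(size, dim))
        # Keys are kept unit-length so a dot product equals cosine similarity.
        self.keys = keys / np.linalg.norm(keys, axis=1, keepdims=True)
        self.values = np.full(size, -1, dtype=np.int64)  # -1 marks an empty slot
        self.age = np.zeros(size, dtype=np.int64)

    @staticmethod
    def _normalize(q):
        return q / np.linalg.norm(q)

    def query(self, q):
        """Return (slot index, stored value) of the key nearest to q."""
        sims = self.keys @ self._normalize(q)
        idx = int(np.argmax(sims))
        return idx, int(self.values[idx])

    def update(self, q, label):
        """Store (q, label) without ever resetting the memory."""
        q = self._normalize(q)
        idx, value = self.query(q)
        self.age += 1
        if value == label:
            # Assumed policy: the nearest slot already holds this label,
            # so merge the query into its key.
            self.keys[idx] = self._normalize(self.keys[idx] + q)
        else:
            # Assumed policy: otherwise overwrite the oldest slot.
            idx = int(np.argmax(self.age))
            self.keys[idx] = q
            self.values[idx] = label
        self.age[idx] = 0


# One-shot usage: each pattern is written once, then recalled by lookup.
mem = TinyMemory(size=8, dim=4)
a = np.array([1.0, 0.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 0.0, 0.0])
mem.update(a, label=3)
mem.update(b, label=7)
print(mem.query(a)[1], mem.query(b)[1])
```

In this sketch a rare pattern seen a single time stays retrievable until its slot becomes the oldest and is overwritten, which is one simple way to realize the "remember without resetting" behavior the abstract refers to. A large-scale version would replace the exact `argmax` with an approximate nearest-neighbor index.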

Tasks

Few-Shot Image Classification, Image Classification, Machine Translation, One-Shot Learning

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
OMNIGLOT - 1-Shot, 20-way	ConvNet with Memory Module	Accuracy	95	—	Unverified
OMNIGLOT - 1-Shot, 5-way	ConvNet with Memory Module	Accuracy	98.4	—	Unverified
OMNIGLOT - 5-Shot, 20-way	ConvNet with Memory Module	Accuracy	98.6	—	Unverified
OMNIGLOT - 5-Shot, 5-way	ConvNet with Memory Module	Accuracy	99.6	—	Unverified