
Regularizing Trajectories to Mitigate Catastrophic Forgetting

2019-09-25

Paul Michel, Elisabeth Salesky, Graham Neubig


Abstract

Regularization-based continual learning approaches generally prevent catastrophic forgetting by augmenting the training loss with an auxiliary objective. However, in most practical optimization scenarios with noisy data and/or gradients, stochastic gradient descent can inadvertently change critical parameters. In this paper, we argue for the importance of regularizing optimization trajectories directly. We derive a new co-natural gradient update rule for continual learning whereby the new task gradients are preconditioned with the empirical Fisher information of previously learnt tasks. We show that using the co-natural gradient systematically reduces forgetting in continual learning. Moreover, it helps combat overfitting when learning a new task in a low-resource scenario.
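To make the update rule concrete, here is a minimal sketch of the idea described in the abstract, under simplifying assumptions not taken from the paper: the Fisher information is approximated by its diagonal (mean squared per-example gradients from the previous task), and a small damping constant keeps the preconditioner invertible. The function names `empirical_fisher_diag` and `co_natural_step` are illustrative, not from the authors' code.

```python
import numpy as np

def empirical_fisher_diag(per_example_grads):
    # Diagonal empirical Fisher: mean of squared per-example gradients
    # collected on the previously learnt task.
    g = np.stack(per_example_grads)
    return (g ** 2).mean(axis=0)

def co_natural_step(theta, grad_new, fisher_diag, lr=0.1, damping=1e-3):
    # Precondition the new-task gradient with the damped inverse Fisher:
    # directions the old task is sensitive to (large Fisher entries)
    # receive proportionally smaller steps, which regularizes the
    # optimization trajectory rather than only the loss.
    return theta - lr * grad_new / (fisher_diag + damping)
```

In this sketch, a parameter with a large Fisher entry barely moves under the new-task gradient, while an unimportant parameter (near-zero Fisher entry) is updated almost as in plain SGD.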
