Neural Linear Bandits: Overcoming Catastrophic Forgetting through Likelihood Matching

2019-09-25

Tom Zahavy, Shie Mannor

Code

github.com/anonymousneurips/neurips
Official, in paper, tf

Abstract

We study neural-linear bandits for solving problems where both exploration and representation learning play an important role. Neural-linear bandits leverage the representation power of deep neural networks and combine it with efficient exploration mechanisms, designed for linear contextual bandits, on top of the last hidden layer. Since the representation is being optimized during learning, information regarding exploration with "old" features is lost. Here, we propose the first limited memory neural-linear bandit that is resilient to this catastrophic forgetting phenomenon. We perform simulations on a variety of real-world problems, including regression, classification, and sentiment analysis, and observe that our algorithm achieves superior performance and shows resilience to catastrophic forgetting.
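The abstract describes running a linear contextual bandit on top of a network's last hidden layer. As a rough illustration of that idea only, the sketch below applies Thompson sampling with per-arm Bayesian linear regression to a fixed feature map. It is not the authors' algorithm: the feature map is a frozen random ReLU layer standing in for a trained network, the representation is never updated, and the limited-memory likelihood-matching step is not shown. All names (`LinearThompsonArm`, `features`, `run`) and all hyperparameters are hypothetical.

```python
import numpy as np


class LinearThompsonArm:
    """Bayesian linear regression for one arm over fixed features phi(x).

    Assumes a zero-mean Gaussian prior and Gaussian reward noise
    (illustrative choices, not taken from the paper).
    """

    def __init__(self, dim, prior_precision=1.0, noise_var=1.0):
        self.precision = prior_precision * np.eye(dim)  # posterior precision
        self.b = np.zeros(dim)                          # precision-weighted mean
        self.noise_var = noise_var

    def update(self, phi, reward):
        # Standard conjugate update for a Gaussian linear model.
        self.precision += np.outer(phi, phi) / self.noise_var
        self.b += phi * reward / self.noise_var

    def posterior(self):
        cov = np.linalg.inv(self.precision)
        cov = 0.5 * (cov + cov.T)  # symmetrize against round-off
        return cov @ self.b, cov

    def sample_weights(self, rng):
        mean, cov = self.posterior()
        return rng.multivariate_normal(mean, cov)


def features(context, W):
    """Hypothetical stand-in for the last hidden layer: one fixed ReLU layer."""
    return np.maximum(W @ context, 0.0)


def run(n_steps=500, n_arms=3, ctx_dim=4, feat_dim=8, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(feat_dim, ctx_dim))            # frozen "representation"
    true_theta = rng.normal(size=(n_arms, feat_dim))    # synthetic reward model
    arms = [LinearThompsonArm(feat_dim) for _ in range(n_arms)]
    total_reward = 0.0
    for _ in range(n_steps):
        phi = features(rng.normal(size=ctx_dim), W)
        # Thompson sampling: draw weights per arm, act greedily on the draws.
        scores = [arm.sample_weights(rng) @ phi for arm in arms]
        a = int(np.argmax(scores))
        reward = float(true_theta[a] @ phi + rng.normal(scale=0.1))
        arms[a].update(phi, reward)
        total_reward += reward
    return arms, total_reward
```

In a neural-linear setting the feature map would change as the network trains, which is where the forgetting problem described above arises: the per-arm statistics accumulated here are tied to the features that produced them.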

Tasks

Efficient Exploration, Multi-Armed Bandits, Regression, Representation Learning, Sentiment Analysis
