Provably and Practically Efficient Neural Contextual Bandits

2022-05-31

Sudeep Salgia, Sattar Vakili, Qing Zhao

Abstract

We consider the neural contextual bandit problem. In contrast to existing work, which primarily focuses on ReLU neural nets, we consider a general set of smooth activation functions. Under this more general setting, (i) we derive non-asymptotic error bounds on the difference between an overparameterized neural net and its corresponding neural tangent kernel, and (ii) we propose an algorithm with a provably sublinear regret bound that is also efficient in the finite regime, as demonstrated by empirical studies. The non-asymptotic error bounds may be of broader interest as a tool to establish the relation between the smoothness of the activation functions in neural contextual bandits and the smoothness of the kernels in kernel bandits.
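
To make the "neural net versus neural tangent kernel" comparison concrete, the sketch below computes an empirical neural tangent kernel, i.e. the Gram matrix of parameter gradients at initialization. It is an illustration only, not the paper's construction: the two-layer architecture, the tanh activation (chosen as one example of a smooth activation), the 1/sqrt(m) output scaling, and the helper names `init_params`, `param_gradient`, and `empirical_ntk` are all assumptions made here.

```python
import math
import random


def init_params(d, m, seed=0):
    """Draw standard Gaussian weights for a width-m two-layer net (assumed setup)."""
    rng = random.Random(seed)
    W = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]
    a = [rng.gauss(0.0, 1.0) for _ in range(m)]
    return W, a


def param_gradient(x, W, a):
    """Gradient of f(x) = (1/sqrt(m)) * sum_j a_j * tanh(w_j . x) w.r.t. (W, a)."""
    m = len(W)
    scale = 1.0 / math.sqrt(m)
    grad_W, grad_a = [], []
    for w_j, a_j in zip(W, a):
        t = math.tanh(sum(wi * xi for wi, xi in zip(w_j, x)))
        # d f / d w_j = a_j * tanh'(w_j . x) * x / sqrt(m), with tanh' = 1 - tanh^2
        grad_W.extend(scale * a_j * (1.0 - t * t) * xi for xi in x)
        # d f / d a_j = tanh(w_j . x) / sqrt(m)
        grad_a.append(scale * t)
    return grad_W + grad_a


def empirical_ntk(X, W, a):
    """Gram matrix K[i][j] = <grad f(x_i), grad f(x_j)> at the given parameters."""
    grads = [param_gradient(x, W, a) for x in X]
    return [[sum(u * v for u, v in zip(g_i, g_j)) for g_j in grads] for g_i in grads]


if __name__ == "__main__":
    contexts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.6, 0.8, 0.0]]
    for width in (16, 256, 4096):
        W, a = init_params(d=3, m=width, seed=width)
        K = empirical_ntk(contexts, W, a)
        print(width, [round(v, 3) for v in K[0]])
```

Running the script prints the first kernel row for several widths; under this setup one would expect the entries to fluctuate less as the width grows, which is the regime that non-asymptotic bounds of the kind described in the abstract are meant to quantify.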

Tasks

Multi-Armed Bandits
