Efficient kernelized bandit algorithms via exploration distributions
Bingshan Hu, Zheng He, Danica J. Sutherland
Abstract
We consider a kernelized bandit problem with a compact arm set X ⊂ R^d and a fixed but unknown reward function f^* with finite norm in some Reproducing Kernel Hilbert Space (RKHS). We propose a class of computationally efficient kernelized bandit algorithms, which we call GP-Generic, based on a novel concept: exploration distributions. This class of algorithms includes Upper Confidence Bound-based approaches as a special case, but also allows for a variety of randomized algorithms. With a careful choice of exploration distribution, our proposed generic algorithm realizes a wide range of concrete algorithms that achieve Õ(γ_T √T) regret bounds, where γ_T characterizes the RKHS complexity. This matches known results for UCB- and Thompson Sampling-based algorithms; we also show empirically that randomization can yield better practical performance.
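The abstract does not give the algorithm itself, but the idea it describes can be sketched concretely: run a standard GP bandit loop, except that the exploration weight β_t multiplying the posterior standard deviation is drawn each round from an "exploration distribution" rather than fixed. A point-mass distribution then recovers a UCB-style rule, while a genuinely random distribution gives a randomized algorithm. The sketch below is a hypothetical illustration under these assumptions (the function name `gp_generic_bandit`, the RBF kernel choice, and the `sample_beta` interface are all mine, not the paper's):

```python
import numpy as np

def rbf(A, B, ls=0.5):
    # Squared-exponential kernel matrix between row sets A (n,d) and B (m,d).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * ls ** 2))

def gp_generic_bandit(arms, reward, T, sample_beta, noise=0.1, rng=None):
    # Hypothetical sketch of a GP bandit loop where the exploration weight
    # beta_t is drawn from an exploration distribution each round.
    # sample_beta(rng) -> scalar; a constant (point mass) gives a UCB rule.
    rng = rng or np.random.default_rng(0)
    K_arms = rbf(arms, arms)
    X, y, picks = [], [], []
    for t in range(T):
        if not X:
            # No data yet: prior mean 0, prior std from the kernel diagonal.
            mu = np.zeros(len(arms))
            sd = np.sqrt(np.diag(K_arms))
        else:
            Xa = np.array(X)
            Kxx = rbf(Xa, Xa) + noise ** 2 * np.eye(len(X))
            Ksx = rbf(arms, Xa)
            mu = Ksx @ np.linalg.solve(Kxx, np.array(y))
            var = np.diag(K_arms) - np.einsum(
                'ij,ji->i', Ksx, np.linalg.solve(Kxx, Ksx.T))
            sd = np.sqrt(np.maximum(var, 1e-12))
        beta = sample_beta(rng)              # randomized exploration weight
        i = int(np.argmax(mu + beta * sd))   # optimistic arm selection
        X.append(arms[i])
        y.append(reward(arms[i]) + noise * rng.standard_normal())
        picks.append(i)
    return picks
```

For instance, `sample_beta = lambda rng: 2.0` behaves like GP-UCB with a fixed confidence width, while `sample_beta = lambda rng: abs(rng.normal(0, 2))` is one example of a randomized choice; the paper's contribution is characterizing which such distributions retain the Õ(γ_T √T) guarantee.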