On Regret Bounds of Thompson Sampling for Bayesian Optimization
Shion Takeno, Shogo Iwazaki
Unverified — Be the first to reproduce this paper.
ReproduceAbstract
We study a widely used Bayesian optimization method, Gaussian process Thompson sampling (GP-TS), under the assumption that the objective function is a sample path from a GP. Compared with the GP upper confidence bound (GP-UCB) with established high-probability and expected regret bounds, most analyses of GP-TS have been limited to expected regret. Moreover, whether the recent analyses of GP-UCB for the lenient regret and the improved cumulative regret upper bound can be applied to GP-TS remains unclear. To fill these gaps, this paper shows several regret bounds: (i) a regret lower bound for GP-TS, which implies that GP-TS suffers from a polynomial dependence on 1/δ with probability δ, (ii) an upper bound of the second moment of cumulative regret, which directly suggests an improved regret upper bound on δ, (iii) expected lenient regret upper bounds, and (iv) an improved cumulative regret upper bound on the time horizon T. Along the way, we provide several useful lemmas, including a relaxation of the necessary condition from recent analysis to obtain improved regret upper bounds on T.