Linear Bandits on Uniformly Convex Sets

2021-03-10Unverified0· sign in to hype

Thomas Kerdreux, Christophe Roux, Alexandre d'Aspremont, Sebastian Pokutta

Unverified — Be the first to reproduce this paper.

Abstract

Linear bandit algorithms yield O(nT) pseudo-regret bounds on compact convex action sets KR^n and two types of structural assumptions lead to better pseudo-regret bounds. When K is the simplex or an _p ball with p]1,2], there exist bandits algorithms with O(nT) pseudo-regret bounds. Here, we derive bandit algorithms for some strongly convex sets beyond _p balls that enjoy pseudo-regret bounds of O(nT), which answers an open question from [BCB12, 5.5.]. Interestingly, when the action set is uniformly convex but not necessarily strongly convex, we obtain pseudo-regret bounds with a dimension dependency smaller than O(n). However, this comes at the expense of asymptotic rates in T varying between O(T) and O(T).

Tasks

Open-Ended Question Answering

Linear Bandits on Uniformly Convex Sets

Abstract

Tasks

Reproductions