On The Complexity of Best-Arm Identification in Non-Stationary Linear Bandits

2026-03-11

Leo Maynard-Zhang, Zhihan Xiong, Kevin Jamieson, Maryam Fazel


Abstract

We study the fixed-budget best-arm identification (BAI) problem in non-stationary linear bandits. Concretely, given a fixed time budget $T \in \mathbb{N}$, a finite arm set $\mathcal{X} \subset \mathbb{R}^d$, and a potentially adversarial sequence of unknown parameters $\{\theta_t\}_{t=1}^T$ (hence non-stationary), a learner aims to identify the arm with the largest cumulative reward, $x_* = \arg\max_{x \in \mathcal{X}} x^\top \sum_{t=1}^T \theta_t$, with high probability. In this setting, it is well-known that uniformly sampling arms from the G-optimal design yields a minimax-optimal error probability of $\exp(-\Theta(T / H_G))$, where $H_G$ scales proportionally with the dimension $d$. However, this notion of complexity is overly pessimistic, as it is derived from a lower bound in which the arm set consists only of the standard basis vectors, thus masking any potential advantages arising from arm sets with richer geometric structure. To address this, we establish an arm-set-dependent lower bound that, in contrast, holds for any arm set. Motivated by the ideas underlying our lower bound, we propose the Adjacent-optimal design, a specialization of the well-known XY-optimal design, and develop the Adjacent-BAI algorithm. We prove that the error probability of Adjacent-BAI matches our lower bound up to constants, verifying the tightness of our lower bound, and establishing the arm-set-dependent complexity of this setting.
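To make the G-optimal baseline mentioned above concrete, here is a minimal sketch in Python. It is not the Adjacent-BAI algorithm, whose design the abstract does not specify. It assumes the design is computed with Frank-Wolfe (Fedorov-Wynn) iterations and that the cumulative parameter is estimated with an inverse-propensity-style estimator; both choices, and the function names `g_optimal_design` and `static_design_bai`, are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np


def g_optimal_design(X, iters=1000, tol=1e-6):
    """Approximate the G-optimal design over the rows of X (assumed to span R^d).

    Uses Frank-Wolfe (Fedorov-Wynn) steps on log det A(lam); by the
    Kiefer-Wolfowitz theorem the optimum has max_x x^T A(lam)^{-1} x = d.
    """
    n, d = X.shape
    lam = np.full(n, 1.0 / n)  # start from the uniform distribution over arms
    for _ in range(iters):
        A = X.T @ (lam[:, None] * X)  # A(lam) = sum_x lam_x x x^T
        A_inv = np.linalg.pinv(A)
        lev = np.einsum("ij,jk,ik->i", X, A_inv, X)  # x^T A^{-1} x per arm
        j = int(np.argmax(lev))
        g = lev[j]
        if g <= d * (1.0 + tol):  # optimality condition (approximately) met
            break
        step = (g / d - 1.0) / (g - 1.0)  # closed-form line-search step
        lam = (1.0 - step) * lam
        lam[j] += step
    return lam / lam.sum()


def static_design_bai(X, thetas, lam, rng, noise_std=0.0):
    """Sample one arm per round from lam, then return the index of the arm
    that maximizes the estimated cumulative reward.

    thetas is the (possibly adversarial) sequence theta_1, ..., theta_T;
    it is only used here to simulate rewards.
    """
    n, d = X.shape
    A_inv = np.linalg.pinv(X.T @ (lam[:, None] * X))
    total = np.zeros(d)  # running estimate of sum_t theta_t
    for theta_t in thetas:
        i = rng.choice(n, p=lam)
        reward = X[i] @ theta_t + noise_std * rng.standard_normal()
        # E[A^{-1} x x^T theta_t] = theta_t when x is drawn from lam
        total += (A_inv @ X[i]) * reward
    return int(np.argmax(X @ total))
```

Because the sampling distribution is fixed in advance, each per-round term is an unbiased estimate of $\theta_t$ regardless of how the sequence changes, which is why a static design is a natural fit for the non-stationary setting.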
