On the Peril of (Even a Little) Nonstationarity in Satisficing Regret Minimization

2026-03-19Unverified0· sign in to hype

Yixuan Zhang, Ruihao Zhu, Qiaomin Xie

Unverified — Be the first to reproduce this paper.

Abstract

Motivated by the principle of satisficing in decision-making, we study satisficing regret guarantees for nonstationary K-armed bandits. We show that in the general realizable, piecewise-stationary setting with L stationary segments, the optimal regret is Θ(L T) as long as L 2. This stands in sharp contrast to the case of L=1 (i.e., the stationary setting), where a T-independent Θ(1) satisficing regret is achievable under realizability. In other words, the optimal regret has to scale with T even if just a little nonstationarity presents. A key ingredient in our analysis is a novel Fano-based framework tailored to nonstationary bandits via a post-interaction reference construction. This framework strictly extends the classical Fano method for passive estimation as well as recent interactive Fano techniques for stationary bandits. As a complement, we also discuss a special regime in which constant satisficing regret is again possible.

On the Peril of (Even a Little) Nonstationarity in Satisficing Regret Minimization

Abstract

Reproductions