An improved regret analysis for UCB-N and TS-N

2023-05-06

Nishant A. Mehta

Abstract

In the setting of stochastic online learning with undirected feedback graphs, Lykouris et al. (2020) previously analyzed the pseudo-regret of the upper confidence bound-based algorithm UCB-N and the Thompson Sampling-based algorithm TS-N. In this note, we show how to improve their pseudo-regret analysis. Our improvement involves refining a key lemma of the previous analysis, allowing a $\log(T)$ factor to be replaced by a factor $\log_2(\alpha) + 3$, where $\alpha$ is the independence number of the feedback graph.
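To give a rough sense of the size of the change, the sketch below compares a log(T) factor with a log_2(alpha) + 3 factor, where T is the time horizon and alpha is the independence number of the feedback graph. The function names are hypothetical, the natural logarithm is assumed for log(T), and only these two factors are evaluated, not the full regret bounds.

```python
import math


def horizon_factor(horizon: int) -> float:
    """Factor log(T); the natural logarithm is an assumption made here."""
    return math.log(horizon)


def independence_factor(independence_number: int) -> float:
    """Factor log_2(alpha) + 3, with alpha the independence number."""
    return math.log2(independence_number) + 3.0


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    alpha = 8
    for horizon in (10**3, 10**6, 10**9):
        print(
            f"T={horizon:>10}  log(T)={horizon_factor(horizon):6.2f}  "
            f"log2(alpha)+3={independence_factor(alpha):5.2f}"
        )
```

The second factor does not depend on T, so it stays fixed as the horizon grows, while the first keeps increasing.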

Tasks

LEMMA, Thompson Sampling
