Active clustering with bandit feedback

2024-06-17Unverified0· sign in to hype

Victor Thuot, Alexandra Carpentier, Christophe Giraud, Nicolas Verzelen

Unverified — Be the first to reproduce this paper.

Abstract

We investigate the Active Clustering Problem (ACP). A learner interacts with an N-armed stochastic bandit with d-dimensional subGaussian feedback. There exists a hidden partition of the arms into K groups, such that arms within the same group, share the same mean vector. The learner's task is to uncover this hidden partition with the smallest budget - i.e., the least number of observation - and with a probability of error smaller than a prescribed constant . In this paper, (i) we derive a non-asymptotic lower bound for the budget, and (ii) we introduce the computationally efficient ACB algorithm, whose budget matches the lower bound in most regimes. We improve on the performance of a uniform sampling strategy. Importantly, contrary to the batch setting, we establish that there is no computation-information gap in the active setting.

Tasks

Clustering

Active clustering with bandit feedback

Abstract

Tasks

Reproductions