Thompson Sampling under Bernoulli Rewards with Local Differential Privacy
2023-07-03
Bo Jiang, Tianchi Zhao, Ming Li
Abstract
This paper investigates regret minimization for multi-armed bandit (MAB) problems under a local differential privacy (LDP) guarantee. Given a fixed privacy budget ε, we consider three privatizing mechanisms in the Bernoulli setting: linear, quadratic, and exponential. Under each mechanism, we derive a stochastic regret bound for the Thompson Sampling algorithm. Finally, we run simulations to illustrate the convergence of the different mechanisms under different privacy budgets.
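
To make the setting concrete, the following is a minimal sketch of Thompson Sampling run on locally privatized Bernoulli rewards. It is not one of the paper's linear, quadratic, or exponential mechanisms, which the abstract does not specify. As a stand-in it assumes the standard randomized response mechanism, which reports the true bit with probability e^ε / (1 + e^ε) and is ε-LDP. The function names, arm means, and horizon are hypothetical and chosen only for illustration.

```python
import math
import random


def randomized_response(bit, epsilon, rng):
    """Assumed stand-in mechanism: keep `bit` with probability
    e^eps / (1 + e^eps), otherwise flip it. This is eps-LDP."""
    keep_prob = 1.0 / (1.0 + math.exp(-epsilon))
    return bit if rng.random() < keep_prob else 1 - bit


def private_thompson_sampling(true_means, epsilon, horizon, seed=0):
    """Run Beta-Bernoulli Thompson Sampling where the learner only
    observes privatized rewards. Returns the pull count per arm."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    successes = [0] * n_arms  # privatized 1s observed per arm
    failures = [0] * n_arms   # privatized 0s observed per arm
    pulls = [0] * n_arms

    for _ in range(horizon):
        # Sample a plausible mean for each arm from its Beta posterior
        # (uniform Beta(1, 1) prior) and play the arm with the largest sample.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        arm = max(range(n_arms), key=samples.__getitem__)

        # The user draws the true Bernoulli reward, then privatizes it
        # locally before the learner sees anything.
        reward = 1 if rng.random() < true_means[arm] else 0
        noisy_reward = randomized_response(reward, epsilon, rng)

        if noisy_reward == 1:
            successes[arm] += 1
        else:
            failures[arm] += 1
        pulls[arm] += 1

    return pulls


if __name__ == "__main__":
    # Hypothetical three-armed instance with privacy budget epsilon = 1.
    print(private_thompson_sampling([0.2, 0.5, 0.8], epsilon=1.0, horizon=5000))
```

Under randomized response, the mean of the privatized reward is an increasing affine function of the true mean, so the best arm stays the best arm and the posterior can be maintained directly on the privatized bits. A smaller ε shrinks the gaps between the privatized means, which is one way to see why a tighter privacy budget slows convergence.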