Differential Privacy for Multi-armed Bandits: What Is It and What Is Its Cost?
2019-05-29Unverified0· sign in to hype
Debabrota Basu, Christos Dimitrakakis, Aristide Tossou
Unverified — Be the first to reproduce this paper.
ReproduceAbstract
Based on differential privacy (DP) framework, we introduce and unify privacy definitions for the multi-armed bandit algorithms. We represent the framework with a unified graphical model and use it to connect privacy definitions. We derive and contrast lower bounds on the regret of bandit algorithms satisfying these definitions. We leverage a unified proving technique to achieve all the lower bounds. We show that for all of them, the learner's regret is increased by a multiplicative factor dependent on the privacy level . We observe that the dependency is weaker when we do not require local differential privacy for the rewards.