Algorithms for Differentially Private Multi-Armed Bandits

2015-11-27Unverified0· sign in to hype

Aristide Tossou, Christos Dimitrakakis

Unverified — Be the first to reproduce this paper.

Abstract

We present differentially private algorithms for the stochastic Multi-Armed Bandit (MAB) problem. This is a problem for applications such as adaptive clinical trials, experiment design, and user-targeted advertising where private information is connected to individual rewards. Our major contribution is to show that there exist (, ) differentially private variants of Upper Confidence Bound algorithms which have optimal regret, O(^-1 + T). This is a significant improvement over previous results, which only achieve poly-log regret O(^-2 ^2 T), because of our use of a novel interval-based mechanism. We also substantially improve the bounds of previous family of algorithms which use a continual release mechanism. Experiments clearly validate our theoretical bounds.

Tasks

Multi-Armed Bandits

Algorithms for Differentially Private Multi-Armed Bandits

Abstract

Tasks

Reproductions