Distributed Skellam Mechanism: a Novel Approach to Federated Learning with Differential Privacy

2021-09-29

Ergute Bao, Yizheng Zhu, Xiaokui Xiao, Yin Yang, Beng Chin Ooi, Benjamin Hong Meng Tan, Khin Mi Mi Aung

Abstract

Deep neural networks have a strong capacity to memorize their underlying training data; on the flip side, unintended data memorization can be a serious privacy concern. An effective and rigorous approach to this problem is to train models with differential privacy (DP), which provides information-theoretic privacy guarantees by injecting random noise into the gradients. This paper focuses on the scenario where sensitive data are distributed among individual participants, who jointly train a model through federated learning, using both secure multiparty computation (MPC) to ensure the confidentiality of individual gradient updates and differential privacy to avoid data leakage from the resulting model. We point out a major challenge in this setting: common mechanisms for enforcing DP in deep learning, which require injecting real-valued noise, are fundamentally incompatible with MPC, which exchanges finite-field integers among the participants. Consequently, existing DP mechanisms require rather high noise levels, leading to poor model utility. Motivated by this, we design and develop the distributed Skellam mechanism (DSM), a novel solution for enforcing differential privacy on models built through an MPC-based federated learning process. Compared to existing approaches, DSM has the advantage that its privacy guarantee is independent of the dimensionality of the gradients; further, DSM allows tight privacy accounting thanks to the favorable composition and sub-sampling properties of the Skellam distribution, which are key to accurate deep learning with DP. The theoretical analysis of DSM is highly non-trivial, especially considering (i) the complicated math of differentially private deep learning in general and (ii) the fact that the Skellam distribution is rather complex and, to our knowledge, had not previously been applied to an iterative, sampling-based process such as stochastic gradient descent. Meanwhile, through extensive experiments in various practical settings, we demonstrate that DSM consistently outperforms existing solutions in terms of model utility, by a large margin.
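To illustrate the core idea behind the abstract's compatibility argument, here is a minimal NumPy sketch of integer-valued Skellam noise. A symmetric Skellam variable is the difference of two i.i.d. Poisson draws, so it is always an integer and can be added to quantized gradients inside finite-field arithmetic. The function names, the scaling factor, and the parameter values below are hypothetical illustrations, not the paper's actual implementation or calibration.

```python
import numpy as np

def skellam_noise(shape, mu, rng=None):
    # Symmetric Skellam(mu, mu) noise: the difference of two independent
    # Poisson(mu) draws. It is integer-valued with mean 0 and variance
    # 2*mu, so it stays compatible with finite-field integer arithmetic.
    rng = np.random.default_rng() if rng is None else rng
    return rng.poisson(mu, shape) - rng.poisson(mu, shape)

def privatize_gradient(grad, scale=2**10, mu=1000.0, modulus=2**32, rng=None):
    # Hypothetical client-side step: scale and round the (already clipped)
    # gradient to integers, add local Skellam noise, then reduce modulo the
    # field size used by the secure-aggregation protocol.
    quantized = np.rint(grad * scale).astype(np.int64)
    noisy = quantized + skellam_noise(quantized.shape, mu, rng)
    return np.mod(noisy, modulus)
```

Because each client's noise is integer-valued, the sum of all clients' noisy contributions (computed under MPC) carries aggregate Skellam noise, which is what the DP analysis is performed on; real-valued Gaussian noise would be destroyed by the rounding that finite-field protocols require.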
