Simulation-based Bayesian Inference from Privacy Protected Data
Yifei Xiong, Nianqiao Phyllis Ju, Sanguo Zhang
Code: github.com/yifei-xiong/simulation-based-bayesian-inference-from-privacy-protected-data (official, PyTorch)
Abstract
Many modern statistical analysis and machine learning applications require training models on sensitive user data. Under a formal definition of privacy protection, differentially private algorithms inject calibrated noise into the confidential data or during the data analysis process to produce privacy-protected datasets or queries. However, restricting access to only privatized data during statistical analysis makes it computationally challenging to make valid statistical inferences. In this work, we propose simulation-based inference methods from privacy-protected datasets. In addition to sequential Monte Carlo approximate Bayesian computation, we adopt neural conditional density estimators as a flexible family of distributions to approximate the posterior distribution of model parameters given the observed private query results. We illustrate our methods on discrete time-series data under an infectious disease model and with ordinary linear regression models. Illustrating the privacy-utility trade-off, our experiments and analysis demonstrate the necessity and feasibility of designing valid statistical inference procedures to correct for biases introduced by the privacy-protection mechanisms.
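As a rough illustration of the idea of simulating the privacy mechanism alongside the data model, the following minimal sketch may help. It is not the authors' implementation. It assumes a toy Gaussian mean model, a Laplace mechanism applied to a clamped sample mean, a N(0, 3^2) prior, and plain rejection ABC in place of the sequential Monte Carlo ABC and neural conditional density estimators described above. All function names and parameter values are hypothetical.

```python
import numpy as np


def privatize_mean(x, epsilon, lower, upper, rng):
    """Release the clamped sample mean(s) along the last axis with Laplace noise."""
    clipped = np.clip(x, lower, upper)
    n = clipped.shape[-1]
    # Replacing one record moves the clamped mean by at most (upper - lower) / n.
    sensitivity = (upper - lower) / n
    means = clipped.mean(axis=-1)
    noise = rng.laplace(scale=sensitivity / epsilon, size=np.shape(means))
    return means + noise


def rejection_abc(s_obs, n, epsilon, lower, upper, rng,
                  n_sims=20000, keep_frac=0.01):
    """Plain rejection ABC that simulates both the data and the privacy noise."""
    # Assumed prior for the unknown mean: N(0, 3^2).
    theta = rng.normal(0.0, 3.0, size=n_sims)
    # One synthetic confidential dataset per prior draw (assumed unit variance).
    x_sim = rng.normal(theta[:, None], 1.0, size=(n_sims, n))
    # Push every synthetic dataset through the same mechanism as the real one.
    s_sim = privatize_mean(x_sim, epsilon, lower, upper, rng)
    # Keep the prior draws whose privatized summary is closest to the observed one.
    n_keep = int(n_sims * keep_frac)
    keep = np.argsort(np.abs(s_sim - s_obs))[:n_keep]
    return theta[keep]


rng = np.random.default_rng(0)
n, epsilon, lower, upper = 100, 1.0, -5.0, 5.0

# The confidential data are never released; only the noisy mean s_obs is.
x_conf = rng.normal(1.0, 1.0, size=n)
s_obs = privatize_mean(x_conf, epsilon, lower, upper, rng)

posterior = rejection_abc(s_obs, n, epsilon, lower, upper, rng)
print("privatized mean:", float(s_obs))
print("approx. posterior mean and sd:", posterior.mean(), posterior.std())
```

Because the simulator includes the Laplace noise, the spread of the accepted draws reflects both sampling variability and the privacy noise. A smaller `epsilon` should therefore widen the approximate posterior, which is one simple view of the privacy-utility trade-off.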