
Differentially Private Kernel Density Estimation

2024-09-03

Erzhi Liu, Jerry Yao-Chieh Hu, Alex Reneau, Zhao Song, Han Liu


Abstract

We introduce a refined differentially private (DP) data structure for kernel density estimation (KDE), offering not only an improved privacy-utility tradeoff but also better efficiency than prior results. Specifically, we study the following mathematical problem: given a similarity function f (or DP KDE) and a private dataset X ⊂ R^d, our goal is to preprocess X so that for any query y ∈ R^d, we approximate ∑_{x ∈ X} f(x, y) in a differentially private fashion. The best previous algorithm for f(x, y) = ‖x − y‖_1 is the node-contaminated balanced binary tree of [Backurs, Lin, Mahabadi, Silwal, and Tarnawski, ICLR 2024]. Their algorithm requires O(nd) space and time for preprocessing, where n = |X|. For any query point, the query time is α d log n, with an error guarantee of (1+α)-approximation and additive error ε^{-1} α^{-0.5} d^{1.5} R log^{1.5} n.

In this paper, we improve on the best previous result [Backurs, Lin, Mahabadi, Silwal, and Tarnawski, ICLR 2024] in three aspects:

- We reduce the query time by a factor of α^{-1} log n.
- We improve the approximation ratio from 1+α to 1.
- We reduce the error dependence by a factor of α^{-0.5}.

From a technical perspective, our method of constructing the search tree differs from previous work [Backurs, Lin, Mahabadi, Silwal, and Tarnawski, ICLR 2024]. In prior work, for each query, the answer is split into α^{-1} log n numbers, each derived from the summation of log n values obtained by interval-tree counting. In contrast, we construct the tree differently, splitting the answer into log n numbers, each of which is a smart combination of two distance values, two counting values, and y itself. We believe our tree structure may be of independent interest.
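To make the decomposition concrete, the following is a minimal one-dimensional sketch of the idea the abstract describes: splitting ∑_x |x − y| into O(log n) terms, each combining a noisy count, a noisy coordinate sum, and y itself, via the identity ∑_x |x − y| = (y·c_≤ − s_≤) + (s_> − y·c_>). This is an illustrative toy, not the authors' actual construction: the names `build`, `l1_kde_query`, and `laplace` are invented for this sketch, the tree here is built directly on the sorted private values (a production DP structure would use a data-independent splitting grid and noise calibrated to the sensitivity and privacy budget), and no formal privacy accounting is done.

```python
import math
import random


def laplace(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) noise; scale == 0 gives exact (non-private) values."""
    if scale == 0:
        return 0.0
    u = rng.random() - 0.5  # uniform on [-0.5, 0.5)
    return -scale * math.copysign(math.log(1.0 - 2.0 * abs(u)), u)


class Node:
    __slots__ = ("lo", "hi", "count", "total", "left", "right")


def build(xs, scale, rng):
    """Balanced binary tree over sorted 1-D data; each node releases a
    Laplace-noised count and coordinate sum of the points it covers.
    Toy version: splits on the data itself, which a real DP structure avoids."""
    node = Node()
    node.lo, node.hi = xs[0], xs[-1]
    node.count = len(xs) + laplace(scale, rng)
    node.total = sum(xs) + laplace(scale, rng)
    if len(xs) == 1:
        node.left = node.right = None
    else:
        mid = len(xs) // 2
        node.left = build(xs[:mid], scale, rng)
        node.right = build(xs[mid:], scale, rng)
    return node


def _split(node, y):
    """Return (count_le, sum_le, count_gt, sum_gt) assembled from the noisy
    aggregates of the O(log n) maximal nodes lying entirely on one side of y."""
    if node.hi <= y:  # whole subtree is <= y: use its two released numbers
        return node.count, node.total, 0.0, 0.0
    if node.lo > y:   # whole subtree is > y
        return 0.0, 0.0, node.count, node.total
    left = _split(node.left, y)   # node straddles y: recurse into children
    right = _split(node.right, y)
    return tuple(a + b for a, b in zip(left, right))


def l1_kde_query(root, y):
    """Estimate sum_x |x - y| as (y*c_le - s_le) + (s_gt - y*c_gt)."""
    c_le, s_le, c_gt, s_gt = _split(root, y)
    return y * c_le - s_le + (s_gt - y * c_gt)
```

With `scale = 0` the noise vanishes and the query reproduces the exact ℓ1 distance sum, which makes the count/sum decomposition easy to check against brute force; with `scale > 0` each of the O(log n) terms carries Laplace noise, so the error grows with the tree depth rather than with n.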
