Data Structures for Density Estimation
Anders Aamand, Alexandr Andoni, Justin Y. Chen, Piotr Indyk, Shyam Narayanan, Sandeep Silwal
Code Available — Be the first to reproduce this paper.
ReproduceCode
- github.com/justc2/datastructdensityestOfficialIn papernone★ 1
Abstract
We study statistical/computational tradeoffs for the following density estimation problem: given k distributions v_1, , v_k over a discrete domain of size n, and sampling access to a distribution p, identify v_i that is "close" to p. Our main result is the first data structure that, given a sublinear (in n) number of samples from p, identifies v_i in time sublinear in k. We also give an improved version of the algorithm of Acharya et al. (2018) that reports v_i in time linear in k. The experimental evaluation of the latter algorithm shows that it achieves a significant reduction in the number of operations needed to achieve a given accuracy compared to prior work.