Efficient Unbiased Sparsification
Leighton Barnes, Stephen Cameron, Timothy Chow, Emma Cohen, Keith Frankston, Benjamin Howard, Fred Kochman, Daniel Scheinerman, Jeffrey VanderKam
Abstract
An unbiased m-sparsification of a vector p ∈ R^n is a random vector Q ∈ R^n with mean p that has at most m < n nonzero coordinates. Unbiased sparsification compresses the original vector without introducing bias; it arises in various contexts, such as in federated learning and sampling sparse probability distributions. Ideally, unbiased sparsification should also minimize the expected value of a divergence function Div(Q, p) that measures how far away Q is from the original p. If Q is optimal in this sense, then we call it efficient. Our main results describe efficient unbiased sparsifications for divergences that are either permutation-invariant or additively separable. Surprisingly, the characterization for permutation-invariant divergences is robust to the choice of divergence function, in the sense that our class of optimal Q for squared Euclidean distance coincides with our class of optimal Q for Kullback-Leibler divergence, or indeed any of a wide variety of divergences.
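To make the definition concrete, the sketch below (ours, not taken from the paper) implements one simple unbiased m-sparsifier: keep a uniformly random subset of m coordinates and rescale each kept entry by n/m, so E[Q] = p while Q has at most m nonzero coordinates. This illustrates unbiasedness only; it is not claimed to be the efficient (divergence-minimizing) construction characterized in the paper, and the function name is our own.

```python
import numpy as np

def uniform_unbiased_sparsify(p, m, rng=None):
    """Return a random m-sparse vector Q with E[Q] = p.

    A minimal illustrative sparsifier: sample m coordinates uniformly
    without replacement and scale each kept entry by n/m. Since each
    coordinate is kept with probability m/n, E[Q_i] = (m/n)(n/m) p_i = p_i.
    """
    rng = np.random.default_rng() if rng is None else rng
    p = np.asarray(p, dtype=float)
    n = p.size
    keep = rng.choice(n, size=m, replace=False)  # indices to retain
    q = np.zeros(n)
    q[keep] = p[keep] * (n / m)  # rescale so the expectation matches p
    return q

# Empirical check of unbiasedness: the average of many draws approaches p.
p = np.array([0.5, 0.2, 0.2, 0.1])
avg = np.mean([uniform_unbiased_sparsify(p, m=2) for _ in range(100_000)], axis=0)
print(avg)  # ≈ [0.5, 0.2, 0.2, 0.1]
```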