Near Neighbor: Who is the Fairest of Them All?

2019-06-06NeurIPS 2019Unverified0· sign in to hype

Sariel Har-Peled, Sepideh Mahabadi

Unverified — Be the first to reproduce this paper.

Abstract

BQSIn this work we study a fair variant of the near neighbor problem. Namely, given a set of n points P and a parameter r, the goal is to preprocess the points, such that given a query point q, any point in the r-neighborhood of the query, i.e., (q,r), have the same probability of being reported as the near neighbor. We show that LSH based algorithms can be made fair, without a significant loss in efficiency. Specifically, we show an algorithm that reports a point in the r-neighborhood of a query q with almost uniform probability. The query time is proportional to O( dns(q.r) (n,c) ), and its space is O((n,c)), where (n,c) and (n,c) are the query time and space of an LSH algorithm for c-approximate near neighbor, and dns(q,r) is a function of the local density around q. Our approach works more generally for sampling uniformly from a sub-collection of sets of a given collection and can be used in a few other applications. Finally, we run experiments to show performance of our approach on real data.

Tasks

All

Near Neighbor: Who is the Fairest of Them All?

Abstract

Tasks

Reproductions