
Optimal Subsampling with Influence Functions

2018-12-01 · NeurIPS 2018

Daniel Ting, Eric Brochu

Abstract

Subsampling is a common and often effective method to deal with the computational challenges of large datasets. However, for most statistical models, there is no well-motivated approach for drawing a non-uniform subsample. We show that the concept of an asymptotically linear estimator and the associated influence function leads to asymptotically optimal sampling probabilities for a wide class of popular models. This is the only tight optimality result for subsampling we are aware of, as other methods provide only probabilistic error bounds or optimal rates. Furthermore, for linear regression models, which have well-studied procedures for non-uniform subsampling, we empirically show that our optimal influence-function-based method outperforms previous approaches even when using approximations to the optimal probabilities.
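
To make the idea in the abstract concrete, here is a minimal sketch for ordinary least squares. It is not the authors' code. It assumes that sampling probabilities are taken proportional to the norm of each point's influence function, that the influence is evaluated at a pilot estimate fitted on a small uniform subsample, and that the final fit reweights kept points by the inverse of their sampling probability. The function names (`influence_norms`, `subsample_ols`) and the `pilot_size` parameter are hypothetical, introduced only for illustration.

```python
import numpy as np


def influence_norms(X, y, beta):
    """Euclidean norm of the OLS influence function at each data point.

    For least squares, the influence of point i is H^{-1} x_i r_i, where
    H = X^T X / n and r_i is the residual at the given estimate beta.
    """
    n = X.shape[0]
    residuals = y - X @ beta
    hessian = X.T @ X / n
    # Solve H z = x_i for all rows at once, then scale each row by its residual.
    psi = np.linalg.solve(hessian, X.T).T * residuals[:, None]
    return np.linalg.norm(psi, axis=1)


def subsample_ols(X, y, m, rng, pilot_size=200):
    """Poisson-subsample about m points and fit weighted least squares.

    Assumption: the pilot estimate comes from a uniform subsample; other
    ways of approximating the influence function are possible.
    """
    n = X.shape[0]
    pilot_idx = rng.choice(n, size=min(n, pilot_size), replace=False)
    beta_pilot, *_ = np.linalg.lstsq(X[pilot_idx], y[pilot_idx], rcond=None)

    norms = influence_norms(X, y, beta_pilot)
    # Probabilities proportional to influence norm, clipped to at most 1.
    probs = np.minimum(1.0, m * norms / norms.sum())

    # Poisson sampling: keep each point independently with its own probability.
    keep = rng.random(n) < probs

    # Inverse-probability weights keep the weighted estimator consistent.
    sqrt_w = np.sqrt(1.0 / probs[keep])
    beta_hat, *_ = np.linalg.lstsq(
        X[keep] * sqrt_w[:, None], y[keep] * sqrt_w, rcond=None
    )
    return beta_hat, probs, keep
```

Note that this sketch computes the influence norms with a full pass over the data, so it illustrates the sampling design rather than a way to avoid touching every row; the abstract's remark about approximations to the optimal probabilities concerns cheaper alternatives, which are not shown here.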
