Robust Estimation under the Wasserstein Distance
Sloan Nietert, Rachel Cummings, Ziv Goldfeld
Code Available — Be the first to reproduce this paper.
ReproduceCode
- github.com/sbnietert/robust-otOfficialIn paperpytorch★ 8
Abstract
We study the problem of robust distribution estimation under the Wasserstein distance, a popular discrepancy measure between probability distributions rooted in optimal transport (OT) theory. Given n samples from an unknown distribution , of which n are adversarially corrupted, we seek an estimate for with minimal Wasserstein error. To address this task, we draw upon two frameworks from OT and robust statistics: partial OT (POT) and minimum distance estimation (MDE). We prove new structural properties for POT and use them to show that MDE under a partial Wasserstein distance achieves the minimax-optimal robust estimation risk in many settings. Along the way, we derive a novel dual form for POT that adds a sup-norm penalty to the classic Kantorovich dual for standard OT. Since the popular Wasserstein generative adversarial network (WGAN) framework implements Wasserstein MDE via Kantorovich duality, our penalized dual enables large-scale generative modeling with contaminated datasets via an elementary modification to WGAN. Numerical experiments demonstrating the efficacy of our approach in mitigating the impact of adversarial corruptions are provided.