Nonparametric MLE for Gaussian Location Mixtures: Certified Computation and Generic Behavior
Yury Polyanskiy, Mark Sellke
Abstract
We study the nonparametric maximum likelihood estimator $\widehat{\pi}$ for Gaussian location mixtures in one dimension. It has been known since (Lindsay, 1983) that given an $n$-point dataset, this estimator always returns a mixture with at most $n$ components, and more recently (Wu-Polyanskiy, 2020) gave a sharp $O(\log n)$ bound for subgaussian data. In this work we study computational aspects of $\widehat{\pi}$. We provide an algorithm which for small enough $\varepsilon>0$ computes an $\varepsilon$-approximation of $\widehat{\pi}$ in Wasserstein distance in time $K+Cnk^2\log\log(1/\varepsilon)$. Here $K$ is data-dependent but independent of $\varepsilon$, while $C$ is an absolute constant and $k=|\mathrm{supp}(\widehat{\pi})|\leq n$ is the number of atoms in $\widehat{\pi}$. We also certifiably compute the exact value of $|\mathrm{supp}(\widehat{\pi})|$ in finite time. These guarantees hold almost surely whenever the dataset $(x_1,\dots,x_n)\in[-cn^{1/4},cn^{1/4}]$ consists of independent points from a probability distribution with a density (relative to Lebesgue measure). We also show the distribution of $\widehat{\pi}$ conditioned to be $k$-atomic admits a density on the associated $2k-1$ dimensional parameter space for all $k\leq\sqrt{n}/3$, and almost sure locally linear convergence of the EM algorithm. One key tool is a classical Fourier analytic estimate for non-degenerate curves.
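For readers unfamiliar with the EM iteration mentioned above, the following is a minimal sketch of plain EM for a $k$-atomic Gaussian location mixture. It is not the certified algorithm of the paper. It assumes unit-variance Gaussian noise and a fixed number of atoms, and the names `em_step`, `log_likelihood`, and `run_em`, as well as the toy dataset, are hypothetical and chosen only for illustration.

```python
import math


def em_step(xs, atoms, weights):
    """One EM update for a unit-variance Gaussian location mixture.

    Assumption (not from the source): noise variance is fixed to 1, and the
    number of atoms k = len(atoms) is held fixed throughout.
    """
    k = len(atoms)
    resp_sum = [0.0] * k    # sum over data of responsibilities, per atom
    resp_x_sum = [0.0] * k  # sum over data of responsibility * x, per atom
    for x in xs:
        # Unnormalized posterior weight of each atom given the observation x.
        dens = [w * math.exp(-0.5 * (x - a) ** 2) for a, w in zip(atoms, weights)]
        total = sum(dens)
        for j in range(k):
            r = dens[j] / total
            resp_sum[j] += r
            resp_x_sum[j] += r * x
    # M-step: atoms become responsibility-weighted means, weights become
    # average responsibilities.
    new_atoms = [resp_x_sum[j] / resp_sum[j] for j in range(k)]
    new_weights = [resp_sum[j] / len(xs) for j in range(k)]
    return new_atoms, new_weights


def log_likelihood(xs, atoms, weights):
    """Log-likelihood of the data under the mixture sum_j w_j N(a_j, 1)."""
    c = 1.0 / math.sqrt(2.0 * math.pi)
    return sum(
        math.log(sum(w * c * math.exp(-0.5 * (x - a) ** 2)
                     for a, w in zip(atoms, weights)))
        for x in xs
    )


def run_em(xs, atoms, weights, iters=200):
    """Run a fixed number of EM steps from the given starting point."""
    for _ in range(iters):
        atoms, weights = em_step(xs, atoms, weights)
    return atoms, weights


# Toy dataset (hypothetical) with two well-separated clusters.
xs = [-2.1, -1.9, -2.0, 2.0, 1.8, 2.2]
start_atoms, start_weights = [-1.0, 1.0], [0.5, 0.5]
fit_atoms, fit_weights = run_em(xs, start_atoms, start_weights)
```

Each EM step cannot decrease the likelihood, so comparing `log_likelihood(xs, start_atoms, start_weights)` with `log_likelihood(xs, fit_atoms, fit_weights)` gives a quick sanity check. The sketch says nothing about how many atoms the true maximizer $\widehat{\pi}$ has, which is the question the certified computation in the abstract addresses.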