SOTAVerified

Identification of Mixtures of Discrete Product Distributions in Near-Optimal Sample and Time Complexity

2023-09-25Unverified0· sign in to hype

Spencer L. Gordon, Erik Jahn, Bijan Mazaheri, Yuval Rabani, Leonard J. Schulman

Unverified — Be the first to reproduce this paper.

Reproduce

Abstract

We consider the problem of identifying, from statistics, a distribution of discrete random variables X_1,,X_n that is a mixture of k product distributions. The best previous sample complexity for n O(k) was (1/)^O(k^2 k) (under a mild separation assumption parameterized by ). The best known lower bound was ((k)). It is known that n 2k-1 is necessary and sufficient for identification. We show, for any n 2k-1, how to achieve sample complexity and run-time complexity (1/)^O(k). We also extend the known lower bound of e^(k) to match our upper bound across a broad range of . Our results are obtained by combining (a) a classic method for robust tensor decomposition, (b) a novel way of bounding the condition number of key matrices called Hadamard extensions, by studying their action only on flattened rank-1 tensors.

Tasks

Reproductions