SOTAVerified

Distributional Matrix Completion via Nearest Neighbors in the Wasserstein Space

2024-10-17Code Available0· sign in to hype

Jacob Feitelberg, Kyuseong Choi, Anish Agarwal, Raaz Dwivedi

Code Available — Be the first to reproduce this paper.

Reproduce

Code

Abstract

We study the problem of distributional matrix completion: Given a sparsely observed matrix of empirical distributions, we seek to impute the true distributions associated with both observed and unobserved matrix entries. This is a generalization of traditional matrix completion, where the observations per matrix entry are scalar-valued. To do so, we utilize tools from optimal transport to generalize the nearest neighbors method to the distributional setting. Under a suitable latent factor model on probability distributions, we establish that our method recovers the distributions in the Wasserstein metric. We demonstrate through simulations that our method (i) provides better distributional estimates for an entry compared to using observed samples for that entry alone, (ii) yields accurate estimates of distributional quantities such as standard deviation and value-at-risk, and (iii) inherently supports heteroscedastic distributions. In addition, we demonstrate our method on a real-world dataset of quarterly earnings prediction distributions. We also prove novel asymptotic results for Wasserstein barycenters over one-dimensional distributions.

Tasks

Reproductions