Blind Regression: Nonparametric Regression for Latent Variable Models via Collaborative Filtering
Dogyoon Song, Christina E. Lee, Yihua Li, Devavrat Shah
Unverified — Be the first to reproduce this paper.
ReproduceAbstract
We introduce the framework of blind regression motivated by matrix completion for recommendation systems: given m users, n movies, and a subset of user-movie ratings, the goal is to predict the unobserved user-movie ratings given the data, i.e., to complete the partially observed matrix. Following the framework of non-parametric statistics, we posit that user u and movie i have features x_1(u) and x_2(i) respectively, and their corresponding rating y(u,i) is a noisy measurement of f(x_1(u), x_2(i)) for some unknown function f. In contrast with classical regression, the features x = (x_1(u), x_2(i)) are not observed, making it challenging to apply standard regression methods to predict the unobserved ratings. Inspired by the classical Taylor's expansion for differentiable functions, we provide a prediction algorithm that is consistent for all Lipschitz functions. In fact, the analysis through our framework naturally leads to a variant of collaborative filtering, shedding insight into the widespread success of collaborative filtering in practice. Assuming each entry is sampled independently with probability at least (m^-1+,n^-1/2+) with > 0, we prove that the expected fraction of our estimates with error greater than is less than ^2 / ^2 plus a polynomially decaying term, where ^2 is the variance of the additive entry-wise noise term. Experiments with the MovieLens and Netflix datasets suggest that our algorithm provides principled improvements over basic collaborative filtering and is competitive with matrix factorization methods.