One-shot learning and big data with n=2
Lee H. Dicker, Dean P. Foster
Abstract
We model a "one-shot learning" situation, where very few (scalar) observations y_1,...,y_n are available. Associated with each observation y_i is a very high-dimensional vector x_i, which provides context for y_i and enables us to predict subsequent observations, given their own context. One of the salient features of our analysis is that the problems studied here are easier when the dimension of x_i is large; in other words, prediction becomes easier when more context is provided. The proposed methodology is a variant of principal component regression (PCR). Our rigorous analysis sheds new light on PCR. For instance, we show that classical PCR estimators may be inconsistent in the specified setting, unless they are multiplied by a scalar c > 1; that is, unless the classical estimator is expanded. This expansion phenomenon appears to be somewhat novel and contrasts with shrinkage methods (c < 1), which are far more common in big data analyses.
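To make the distinction between a classical, an expanded (c > 1), and a shrunken (c < 1) estimator concrete, the following is a minimal sketch of one-component PCR with a multiplicative scalar. It is not the estimator analyzed in the paper: the function name `pcr_one_component`, the use of a single uncentered principal component, the value c = 1.5, and the synthetic data are all illustrative assumptions, and the abstract does not say how c should be chosen.

```python
import numpy as np


def pcr_one_component(X, y, c=1.0):
    """One-component principal component regression, scaled by c.

    X: (n, d) array of context vectors x_i; y: (n,) array of observations y_i.
    c = 1 gives the classical estimator, c > 1 expands it, c < 1 shrinks it.
    (Hypothetical helper for illustration only.)
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Leading right singular vector of X: the first principal direction.
    # The data are not centered here, which is an assumption of this sketch.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    u = vt[0]
    scores = X @ u                              # projection of each x_i onto u
    slope = (scores @ y) / (scores @ scores)    # least squares of y on the scores
    return c * slope * u                        # coefficient vector of length d


# Synthetic data with n = 2 observations and d = 1000 context dimensions.
rng = np.random.default_rng(0)
n, d = 2, 1000
X = rng.normal(size=(n, d))
y = rng.normal(size=n)

beta_classical = pcr_one_component(X, y, c=1.0)
beta_expanded = pcr_one_component(X, y, c=1.5)   # c = 1.5 is an arbitrary choice

# Predict a new observation from its own context vector.
x_new = rng.normal(size=d)
y_hat = x_new @ beta_expanded
```

The expanded coefficient vector is simply c times the classical one, so the two estimators point in the same direction and differ only in magnitude.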