Extending the Scope of Inference About Predictive Ability to Machine Learning Methods
Juan Carlos Escanciano, Ricardo Parra
Abstract
The use of machine learning methods for predictive purposes has increased dramatically over the past two decades, but uncertainty quantification for predictive comparisons remains elusive. This paper addresses this gap by extending the classic inference theory for predictive ability in time series to modern machine learners, such as the Lasso or deep learning. We investigate the conditions under which such extensions are possible. For standard out-of-sample asymptotic inference to be valid with machine learning, two key properties must hold: (i) a zero-mean condition for the score of the prediction loss function and (ii) a "fast rate" of convergence for the machine learner. If either condition fails, the estimation risk may be unbounded, and inferences may be invalid and highly sensitive to the sample split. For accurate inferences, we recommend an 80%-20% training-test splitting rule. We illustrate the wide applicability of our results with three applications: high-dimensional time series regressions with the Lasso, deep learning for binary outcomes, and a new out-of-sample test for the Martingale Difference Hypothesis (MDH). The theoretical results are supported by extensive Monte Carlo simulations and an empirical application evaluating the MDH of several major exchange rates at daily and higher frequencies.
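To make the out-of-sample setup concrete, the following is a minimal sketch (not the paper's estimator) of the 80%-20% training-test split and a Diebold-Mariano-type statistic on the loss differential between two forecasts. The data-generating process, the AR(1) forecaster, and the unconditional-mean benchmark are all illustrative assumptions chosen for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: a simulated AR(1) series (not from the paper)
n, phi = 1000, 0.5
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + rng.standard_normal()

# 80%-20% training-test split, as recommended in the abstract
split = int(0.8 * n)
train = y[:split]

# Forecast A: OLS AR(1) coefficient estimated on the training sample only
x_tr, y_tr = train[:-1], train[1:]
phi_hat = (x_tr @ y_tr) / (x_tr @ x_tr)

# One-step-ahead forecasts on the test sample
x_te, y_te = y[split - 1:-1], y[split:]
pred_a = phi_hat * x_te                    # AR(1) forecast
pred_b = np.full_like(y_te, train.mean())  # benchmark: unconditional mean

# Diebold-Mariano-type t-statistic on the squared-error loss differential;
# a positive value favors forecast A over the benchmark
d = (y_te - pred_b) ** 2 - (y_te - pred_a) ** 2
dm_stat = d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))
print(round(dm_stat, 2))
```

Under the paper's conditions (i) and (ii), statistics of this form retain their standard asymptotic distribution even when the forecasts come from regularized machine learners such as the Lasso rather than from OLS.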