An accuracy-runtime trade-off comparison of scalable Gaussian process approximations for spatial data
Filippo Rambelli, Fabio Sigrist
Code Available — Be the first to reproduce this paper.
ReproduceCode
Abstract
Gaussian processes (GPs) are flexible, probabilistic, non-parametric models widely employed in various fields such as spatial statistics and machine learning. A drawback of Gaussian processes is their computational cost having O(N^3) time and O(N^2) memory complexity which makes them prohibitive for large data sets. Numerous approximation techniques have been proposed to address this limitation. In this work, we systematically compare the accuracy of different Gaussian process approximations concerning likelihood evaluation, parameter estimation, and prediction taking into account the computational time required to perform these tasks. In other words, we analyze the trade-off between accuracy and runtime on multiple simulated and large-scale real-world data sets. We find that Vecchia approximations consistently emerge as the most accurate in almost all experiments.