Integrated Gradient attribution for Gaussian Processes with non-Gaussian likelihoods
Sarem Seitz
Code
- github.com/SaremS/iggp (official, PyTorch)
- github.com/MindSpore-scientific/code-10/tree/main/Integrated-Gradient (MindSpore)
- github.com/MindSpore-scientific/code-11/tree/main/Integrated-Gradient (MindSpore)
Abstract
Gaussian Processes (GPs) have proven themselves as a reliable and effective method in probabilistic machine learning. Thanks to recent advances, modelling complex data with GPs is becoming increasingly feasible, which makes these models an interesting alternative to neural networks and deep learning methods. For the latter, there is growing interest in so-called explainability approaches, in essence methods that aim to make a machine learning model's decision process transparent to humans. Such methods are particularly needed when illogical or biased reasoning can lead to disadvantageous consequences for humans. Ideally, explainable machine learning can help detect such flaws in a model and aid the subsequent debugging process. One active line of research in explainable machine learning is gradient-based methods, which have been successfully applied to complex neural networks. Given that GPs are closed under differentiation, gradient-based explainability for GPs, and the concept of Integrated Gradients in particular, appears to be a promising field of research. While GP regression models with Gaussian likelihoods allow for a relatively straightforward derivation of Integrated Gradients, the matter is more complicated for GPs with non-Gaussian likelihoods. As the latter typically require non-linear transformations of the GP, the resulting processes no longer have the theoretical properties that make this derivation straightforward. This paper is therefore concerned with providing a way to calculate Integrated Gradients in such cases. We discuss several common link functions and derive both closed-form and approximate results.
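To make the Gaussian-likelihood baseline case concrete, the following is a minimal NumPy sketch, not the paper's method and not taken from the official repository. It assumes an RBF kernel, a zero prior mean, a straight-line path from an all-zeros baseline, and it attributes only the posterior mean (the uncertainty of the attributions and the non-Gaussian likelihoods treated in the paper are ignored). All function names, the toy data, and the hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np

def rbf(A, B, lengthscale=1.0):
    # Squared-exponential kernel matrix between the rows of A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def fit_gp_mean(X, y, noise=1e-2, lengthscale=1.0):
    # GP regression with Gaussian likelihood: posterior mean and its gradient.
    K = rbf(X, X, lengthscale) + noise * np.eye(len(X))
    alpha = np.linalg.solve(K, y)

    def mean(Z):
        return rbf(Z, X, lengthscale) @ alpha

    def grad(Z):
        # d/dz k(z, x_i) = -(z - x_i) / lengthscale^2 * k(z, x_i)
        Kzx = rbf(Z, X, lengthscale)              # shape (m, n)
        diff = Z[:, None, :] - X[None, :, :]      # shape (m, n, d)
        return -np.einsum("mn,mnd,n->md", Kzx, diff, alpha) / lengthscale**2

    return mean, grad

def integrated_gradients(grad, x, baseline, steps=200):
    # Midpoint-rule approximation of the path integral of the gradient
    # along the straight line from baseline to x.
    t = (np.arange(steps) + 0.5) / steps
    path = baseline + t[:, None] * (x - baseline)
    return (x - baseline) * grad(path).mean(axis=0)

# Hypothetical toy data, used only for illustration.
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(30, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]

mean, grad = fit_gp_mean(X, y)
x = np.array([1.0, -0.5])
baseline = np.zeros(2)

attr = integrated_gradients(grad, x, baseline)
# Completeness check: attributions should sum to mean(x) - mean(baseline).
gap = attr.sum() - (mean(x[None])[0] - mean(baseline[None])[0])
print("attributions:", attr)
print("completeness gap:", gap)
```

The completeness gap should be close to zero, since Integrated Gradients attributions sum to the difference between the function value at the input and at the baseline. Because differentiation and integration are linear operations, the same construction applied to the GP itself rather than its mean remains Gaussian, which is what breaks once a non-linear link function is introduced.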