
The Gaussian Process Prior VAE for Interpretable Latent Dynamics from Pixels

2019-10-16 · Advances in Approximate Bayesian Inference (AABI) Symposium 2019

Michael Arthur Leopold Pearce


Abstract

We consider the problem of unsupervised learning of a low-dimensional, interpretable latent state of a video containing a moving object. The problem of distilling dynamics from pixels has been extensively studied through the lens of graphical/state-space models that exploit Markov structure for cheap computation and structured graphical-model priors for enforcing interpretability on latent representations. We take a step towards extending these approaches by discarding the Markov structure and instead repurposing the recently proposed Gaussian Process Prior Variational Autoencoder to learn sophisticated latent trajectories. We describe the model and perform experiments on a synthetic dataset, finding that the model reliably reconstructs smooth dynamics exhibiting U-turns and loops. We also observe that this model can be trained without any beta-annealing or freeze-thaw of training parameters: training is performed purely end-to-end on the unmodified evidence lower bound objective. This is in contrast to previous works, albeit for slightly different use cases, where application-specific training tricks are often required.
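The end-to-end ELBO training described above hinges on replacing the usual i.i.d. standard-normal prior over latent codes with a Gaussian process prior over time, so the KL term compares the per-frame encoder posterior against a temporally correlated Gaussian. A minimal NumPy sketch of that KL term for a single latent dimension follows; the function names, RBF kernel choice, and the diagonal-posterior assumption are illustrative, not taken from the paper.

```python
import numpy as np

def rbf_kernel(ts, lengthscale=1.0, variance=1.0, jitter=1e-6):
    """Squared-exponential (RBF) covariance over frame timestamps ts."""
    d = ts[:, None] - ts[None, :]
    K = variance * np.exp(-0.5 * (d / lengthscale) ** 2)
    # Jitter keeps the matrix numerically positive definite.
    return K + jitter * np.eye(len(ts))

def gp_prior_kl(mu, var, K):
    """KL( N(mu, diag(var)) || N(0, K) ) for one latent dimension.

    mu, var: per-frame posterior means/variances from the encoder, shape (T,).
    K: GP prior covariance over the T frame times.
    """
    T = len(mu)
    Kinv = np.linalg.inv(K)
    S = np.diag(var)
    _, logdet_K = np.linalg.slogdet(K)
    logdet_S = np.sum(np.log(var))
    return 0.5 * (np.trace(Kinv @ S) + mu @ Kinv @ mu - T + logdet_K - logdet_S)
```

In a full model this KL (summed over latent dimensions) would be subtracted from the frame reconstruction log-likelihood to form the ELBO; here the posterior is diagonal for simplicity, whereas a structured posterior can tighten the bound.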
