Wasserstein Training of Restricted Boltzmann Machines

2016-12-01 · NeurIPS 2016

Grégoire Montavon, Klaus-Robert Müller, Marco Cuturi

Abstract

Boltzmann machines are able to learn highly complex, multimodal, structured and multiscale real-world data distributions. Parameters of the model are usually learned by minimizing the Kullback-Leibler (KL) divergence from training samples to the learned model. We propose in this work a novel approach for Boltzmann machine training which assumes that a meaningful metric between observations is known. This metric between observations can then be used to define the Wasserstein distance between the distribution induced by the Boltzmann machine on the one hand, and that given by the training sample on the other hand. We derive a gradient of that distance with respect to the model parameters. Minimization of this new objective leads to generative models with different statistical properties. We demonstrate their practical potential on data completion and denoising, for which the metric between observations plays a crucial role.
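To make the objective concrete, the following is a minimal sketch of a Wasserstein distance between two sets of binary observations. It rests on several assumptions that are not stated in the abstract: the Hamming distance is used as the metric between observations, a second set of samples stands in for the distribution induced by the Boltzmann machine, and the distance is approximated with entropy-regularized Sinkhorn iterations. The names `hamming_cost`, `sinkhorn_distance`, `reg`, and `n_iter` are hypothetical and chosen only for illustration.

```python
import numpy as np


def hamming_cost(X, Y):
    """Pairwise Hamming distances between rows of binary arrays X (n, d) and Y (m, d)."""
    return (X[:, None, :] != Y[None, :, :]).sum(axis=2).astype(float)


def sinkhorn_distance(p, q, C, reg=0.5, n_iter=500):
    """Approximate the Wasserstein distance between weight vectors p and q.

    C[i, j] is the cost of moving mass from point i to point j.
    Returns the transport cost <T, C> and the transport plan T.
    The entropic regularization `reg` is an illustrative choice.
    """
    K = np.exp(-C / reg)              # Gibbs kernel of the cost matrix
    u = np.ones_like(p)
    for _ in range(n_iter):
        v = q / (K.T @ u)             # rescale to match column marginals
        u = p / (K @ v)               # rescale to match row marginals
    T = u[:, None] * K * v[None, :]   # regularized transport plan
    return float((T * C).sum()), T


# Toy data: "training" samples and stand-in "model" samples (both made up).
data = np.array([[0, 0, 0, 0], [1, 1, 1, 1]])
model = np.array([[0, 0, 0, 1], [1, 1, 0, 1], [0, 1, 0, 1]])
p = np.full(len(data), 1.0 / len(data))
q = np.full(len(model), 1.0 / len(model))

dist, plan = sinkhorn_distance(p, q, hamming_cost(data, model))
print(dist)
```

Unlike the KL divergence, this quantity depends on how far apart the observations are under the chosen metric, so a model sample one bit away from a training point is penalized less than one that differs in every bit.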

Tasks

Denoising
