Efficient-VDVAE: Less is more
Louay Hazami, Rayhane Mama, Ragavan Thurairatnam
- Code: github.com/Rayhane-mamah/Efficient-VDVAE (official, JAX)
Abstract
Hierarchical VAEs have emerged in recent years as a reliable option for maximum likelihood estimation. However, instability issues and demanding computational requirements have hindered research progress in the area. We present simple modifications to the Very Deep VAE (VDVAE) that make it converge up to 2.6× faster, save up to 20× in memory load, and improve stability during training. Despite these changes, our models achieve comparable or better negative log-likelihood performance than current state-of-the-art models on all 7 commonly used image datasets we evaluated on. We also argue against using 5-bit benchmarks to measure hierarchical VAE performance, due to undesirable biases caused by the 5-bit quantization. Additionally, we empirically demonstrate that roughly 3% of a hierarchical VAE's latent space dimensions suffice to encode most of the image information without loss of performance, opening the door to efficiently leveraging the hierarchical VAEs' latent space in downstream tasks. We release our source code and models at https://github.com/Rayhane-mamah/Efficient-VDVAE .
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| Binarized MNIST | Efficient-VDVAE | NLL (nats) | 79.09 | — | Unverified |
| CelebA 256×256 | Efficient-VDVAE | bits/dim | 0.51 | — | Unverified |
| CelebA 64×64 | Efficient-VDVAE | bits/dim | 1.83 | — | Unverified |
| CelebA-HQ 1024×1024 | Efficient-VDVAE | bits/dim | 1.01 | — | Unverified |
| FFHQ 1024×1024 | Efficient-VDVAE | bits/dim | 2.30 | — | Unverified |
| FFHQ 256×256 | Efficient-VDVAE (DINOv2) | FD | 514.16 | — | Unverified |
| FFHQ 256×256 | Efficient-VDVAE | FID | 34.88 | — | Unverified |
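The table above reports Binarized MNIST in nats but the other likelihood results in bits per dimension. For comparing across rows, a total negative log-likelihood in nats converts to bits/dim by dividing by ln(2) times the number of data dimensions. A minimal sketch of the conversion (the helper name is ours, not from the paper):

```python
import math

def nats_to_bits_per_dim(nll_nats: float, num_dims: int) -> float:
    """Convert a total negative log-likelihood in nats to bits per dimension.

    1 nat = 1/ln(2) bits, and bits/dim averages over all data dimensions.
    """
    return nll_nats / (math.log(2) * num_dims)

# Example: Binarized MNIST images are 28x28 = 784 dimensions,
# so 79.09 nats corresponds to roughly 0.146 bits/dim.
print(nats_to_bits_per_dim(79.09, 28 * 28))
```

Note that the two scales are not directly comparable across datasets, since bits/dim also depends on image resolution and bit depth.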