High Fidelity Image Synthesis With Deep VAEs In Latent Space
Troy Luhman, Eric Luhman
Code
- Official PyTorch implementation: github.com/ericl122333/latent-vae
- Official JAX implementation: github.com/ericl122333/latent-vae-jax
Abstract
We present fast, realistic image generation on high-resolution, multimodal datasets using hierarchical variational autoencoders (VAEs) trained on the latent space of a deterministic autoencoder. In this two-stage setup, the autoencoder compresses the image into its semantic features, which are then modeled with a deep VAE. With this method, the VAE avoids modeling the fine-grained details that constitute the majority of the image's code length, allowing it to focus on learning the image's structural components. We demonstrate the effectiveness of our two-stage approach, achieving an FID of 9.34 on the ImageNet-256 dataset, comparable to BigGAN. We make our implementation available online.
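The two-stage interface described above can be sketched in a few lines. The sketch below is a toy stand-in, not the paper's implementation: the deterministic autoencoder is replaced by a fixed random linear projection, and the deep hierarchical VAE by a single standard-normal latent prior, purely to illustrate how generation composes the two stages (sample a latent from the VAE, then decode it to pixel space with the frozen stage-1 decoder). All names (`ae_encode`, `ae_decode`, `vae_sample`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stage 1 (toy stand-in): a deterministic "autoencoder" that compresses a
# flattened image x into a lower-dimensional latent z. In the paper this is
# a learned deep autoencoder; here it is a fixed random linear map.
D, d = 64, 8                          # image dim, latent dim (illustrative)
W = rng.standard_normal((D, d)) / np.sqrt(D)

def ae_encode(x):                     # image -> semantic latent code
    return x @ W

def ae_decode(z):                     # latent code -> image space
    return z @ W.T

# Stage 2 (toy stand-in): the paper trains a deep hierarchical VAE on the
# latents; here a standard-normal draw stands in for the trained VAE's
# generative path, only to show the two-stage sampling interface.
def vae_sample(n):
    return rng.standard_normal((n, d))

# Generation: sample latents from the stage-2 model, then map them back to
# pixel space with the frozen stage-1 decoder.
x_gen = ae_decode(vae_sample(4))
print(x_gen.shape)                    # (4, 64)
```

Because the fine-grained pixel detail lives in the stage-1 decoder, the stage-2 model only has to fit the much shorter latent codes, which is the motivation for the split.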