Wavelet Diffusion Models are fast and scalable Image Generators

2022-11-29 · CVPR 2023 · Code Available

Hao Phung, Quan Dao, Anh Tran


Code

github.com/vinairesearch/wavediff
Official · In paper · PyTorch · ★ 435

Abstract

Diffusion models are rising as a powerful solution for high-fidelity image generation, exceeding GANs in quality in many circumstances. However, their slow training and inference are a major bottleneck that blocks their use in real-time applications. A recent DiffusionGAN method significantly decreases running time by reducing the number of sampling steps from thousands to several, but it still lags far behind its GAN counterparts in speed. This paper aims to reduce the speed gap by proposing a novel wavelet-based diffusion scheme. We extract low- and high-frequency components at both the image and feature levels via wavelet decomposition and adaptively handle these components for faster processing while maintaining good generation quality. Furthermore, we propose to use a reconstruction term, which effectively boosts the convergence of model training. Experimental results on the CelebA-HQ, CIFAR-10, LSUN-Church, and STL-10 datasets show that our solution is a stepping stone toward real-time, high-fidelity diffusion models. Our code and pre-trained checkpoints are available at https://github.com/VinAIResearch/WaveDiff.git.
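The wavelet decomposition mentioned in the abstract can be illustrated with a minimal sketch. It assumes a single-level 2D Haar transform (chosen here for simplicity) applied to a grayscale image stored as a plain list of rows. The function names `haar_dwt2` and `haar_idwt2` are hypothetical and are not taken from the WaveDiff repository, and the LH/HL naming of the sub-bands follows one common convention among several.

```python
def haar_dwt2(image):
    """Single-level 2D Haar transform of a 2D list with even height and width.

    Returns four sub-bands (LL, LH, HL, HH), each half the input resolution.
    """
    h, w = len(image), len(image[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    ll, lh, hl, hh = ([[0.0] * (w // 2) for _ in range(h // 2)] for _ in range(4))
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            # Each 2x2 block [[a, b], [c, d]] yields one coefficient per sub-band.
            a, b = image[i][j], image[i][j + 1]
            c, d = image[i + 1][j], image[i + 1][j + 1]
            r, s = i // 2, j // 2
            ll[r][s] = (a + b + c + d) / 2  # low-frequency average
            lh[r][s] = (a + b - c - d) / 2  # difference between rows
            hl[r][s] = (a - b + c - d) / 2  # difference between columns
            hh[r][s] = (a - b - c + d) / 2  # diagonal difference
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Invert haar_dwt2, rebuilding the full-resolution image."""
    h, w = len(ll), len(ll[0])
    image = [[0.0] * (2 * w) for _ in range(2 * h)]
    for r in range(h):
        for s in range(w):
            p, q, u, v = ll[r][s], lh[r][s], hl[r][s], hh[r][s]
            image[2 * r][2 * s] = (p + q + u + v) / 2
            image[2 * r][2 * s + 1] = (p + q - u - v) / 2
            image[2 * r + 1][2 * s] = (p - q + u - v) / 2
            image[2 * r + 1][2 * s + 1] = (p - q - u + v) / 2
    return image


if __name__ == "__main__":
    img = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    ll, lh, hl, hh = haar_dwt2(img)
    print("LL sub-band:", ll)
    print("Lossless round trip:", haar_idwt2(ll, lh, hl, hh) == img)
```

Because each sub-band has a quarter of the pixels of the input, a model operating on the sub-bands works at half the spatial resolution, which is one plausible reading of where the speed-up described in the abstract comes from.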

Tasks

Blocking, Image Generation

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
CelebA-HQ 1024x1024	WaveDiff	FID	5.98	—	Unverified
CelebA-HQ 256x256	WaveDiff	FID	5.94	—	Unverified
CelebA-HQ 512x512	WaveDiff	FID	6.4	—	Unverified
LSUN Churches 256x256	WaveDiff	FID	5.06	—	Unverified
STL-10	WaveDiff	FID	12.93	—	Unverified
