Fre-GAN: Adversarial Frequency-consistent Audio Synthesis

2021-06-04

Ji-Hoon Kim, Sang-Hoon Lee, Ji-Hyun Lee, Seong-Whan Lee

Code

github.com/rishikksh20/Fre-GAN-pytorch (PyTorch, ★ 112)
github.com/chldkato/Fre-GAN-pytorch (PyTorch, ★ 9)

Abstract

Although recent work on neural vocoders has improved the quality of synthesized audio, a gap remains between generated and ground-truth audio in frequency space. This difference leads to spectral artifacts such as hissing noise or reverberation, and thus degrades sample quality. In this paper, we propose Fre-GAN, which achieves frequency-consistent audio synthesis with highly improved generation quality. Specifically, we first present a resolution-connected generator and resolution-wise discriminators, which help learn various scales of spectral distributions over multiple frequency bands. Additionally, to reproduce high-frequency components accurately, we leverage the discrete wavelet transform in the discriminators. In our experiments, Fre-GAN achieves high-fidelity waveform generation with a gap of only 0.03 MOS compared to ground-truth audio, while outperforming standard models in quality.
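To illustrate why a discrete wavelet transform can help preserve high-frequency components, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not name the wavelet or say how the transform is wired into the discriminators, so the Haar wavelet, the toy signal, and the helper names `haar_dwt`, `haar_idwt`, and `avg_pool` are all assumptions made for illustration. The sketch only shows that a one-level DWT halves the time resolution while keeping the high-frequency detail in a second subband, whereas plain pairwise averaging discards it.

```python
import math

def haar_dwt(signal):
    """One-level Haar DWT (assumed wavelet): returns (low, high) subbands,
    each half the length of the input."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    low = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return low, high

def haar_idwt(low, high):
    """Inverse of haar_dwt: interleaves the reconstructed sample pairs."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(low, high):
        out.append((a + d) * s)
        out.append((a - d) * s)
    return out

def avg_pool(signal):
    """Pairwise average pooling with stride 2, for comparison."""
    return [(signal[i] + signal[i + 1]) / 2.0 for i in range(0, len(signal), 2)]

# Toy signal (hypothetical): a slow ramp plus a +1/-1 alternation,
# i.e. a component at the highest representable frequency.
signal = [0.1 * i + (1.0 if i % 2 == 0 else -1.0) for i in range(8)]

low, high = haar_dwt(signal)
recon = haar_idwt(low, high)
pooled = avg_pool(signal)

# The DWT is invertible: both subbands together recover the input.
max_err = max(abs(x - y) for x, y in zip(signal, recon))
print("max reconstruction error:", max_err)

# Average pooling keeps only a scaled copy of the low band; the
# alternation survives solely in the DWT's high band.
print("pooled:", [round(p, 3) for p in pooled])
print("high band:", [round(h, 3) for h in high])
```

In this toy case the pooled output is a smooth ramp with the alternation gone, while the high subband carries it as a constant nonzero value. A discriminator that sees both subbands has access to information that a pooled input alone would not provide.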

Tasks

Audio Synthesis
