Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram
Ryuichi Yamamoto, Eunwoo Song, Jae-Min Kim
Code
- github.com/coqui-ai/TTS (PyTorch) ★ 44,894
- github.com/PaddlePaddle/PaddleSpeech (Paddle) ★ 12,564
- github.com/TensorSpeech/TensorflowTTS (TensorFlow) ★ 3,995
- github.com/facebookresearch/denoiser (PyTorch) ★ 1,882
- github.com/bigpon/vcc20_baseline_cyclevae (PyTorch) ★ 131
- github.com/bigpon/QPPWG (PyTorch) ★ 46
- github.com/yanggeng1995/GAN-TTS (PyTorch) ★ 0
- github.com/Moon-sung-woo/ParallelWaveGan_korean (PyTorch) ★ 0
- github.com/yanggeng1995/FB-MelGAN (PyTorch) ★ 0
- github.com/mukeshv0/ParallelWaveGAN (PyTorch) ★ 0
Abstract
We propose Parallel WaveGAN, a distillation-free, fast, and small-footprint waveform generation method using a generative adversarial network. In the proposed method, a non-autoregressive WaveNet is trained by jointly optimizing multi-resolution spectrogram and adversarial loss functions, which effectively capture the time-frequency distribution of realistic speech waveforms. As our method does not require the density distillation used in the conventional teacher-student framework, the entire model can be easily trained. Furthermore, our model generates high-fidelity speech even with its compact architecture. In particular, the proposed Parallel WaveGAN has only 1.44 M parameters and can generate a 24 kHz speech waveform 28.68 times faster than real time in a single-GPU environment. Perceptual listening test results verify that our proposed method achieves a 4.16 mean opinion score within a Transformer-based text-to-speech framework, which is comparable to the best distillation-based Parallel WaveNet system.
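The abstract's key training idea is a multi-resolution spectrogram loss: the generated and reference waveforms are compared in the STFT magnitude domain at several FFT/hop/window settings, and the per-resolution losses are averaged. The following is a minimal NumPy sketch of such a loss; the specific resolutions and the choice of spectral-convergence plus log-magnitude terms are assumptions drawn from common practice, not details stated on this page.

```python
import numpy as np

def stft_magnitude(x, fft_size, hop_size, win_size):
    """Magnitude STFT of a 1-D signal via Hann-windowed framing + real FFT."""
    window = np.hanning(win_size)
    n_frames = 1 + (len(x) - win_size) // hop_size
    frames = np.stack([x[i * hop_size : i * hop_size + win_size] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, n=fft_size, axis=-1))

def multi_resolution_stft_loss(y_hat, y,
                               resolutions=((1024, 256, 1024),
                                            (2048, 512, 2048),
                                            (512, 128, 512))):
    """Average spectral-convergence + log-magnitude L1 loss over several
    STFT resolutions (fft_size, hop_size, win_size). Illustrative values."""
    eps = 1e-7
    losses = []
    for fft_size, hop_size, win_size in resolutions:
        s_hat = stft_magnitude(y_hat, fft_size, hop_size, win_size)
        s_ref = stft_magnitude(y, fft_size, hop_size, win_size)
        # Spectral convergence: relative Frobenius-norm error of magnitudes.
        sc = np.linalg.norm(s_ref - s_hat) / (np.linalg.norm(s_ref) + eps)
        # L1 distance between log magnitudes.
        mag = np.mean(np.abs(np.log(s_ref + eps) - np.log(s_hat + eps)))
        losses.append(sc + mag)
    return float(np.mean(losses))
```

In training, this term would be added to the adversarial loss on the generator; comparing at multiple resolutions trades off time and frequency localization so that no single STFT setting dominates the fit.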