Generative Adversarial Training for Text-to-Speech Synthesis Based on Raw Phonetic Input and Explicit Prosody Modelling

2023-10-14

Tiberiu Boros, Stefan Daniel Dumitrescu, Ionut Mironica, Radu Chivereanu


Code

github.com/tiberiu44/TTS-Cube
Official · In paper · PyTorch · ★ 223

Abstract

We describe an end-to-end speech synthesis system that uses generative adversarial training. We train our vocoder for raw phoneme-to-audio conversion, using explicit phonetic, pitch and duration modeling. We experiment with several pre-trained models for contextualized and decontextualized word embeddings, and we introduce a new method for highly expressive character voice matching, based on discrete style tokens.

Tasks

Speech Synthesis, text-to-speech, Text to Speech, Text-To-Speech Synthesis, Word Embeddings
