| Towards Spontaneous Style Modeling with Semi-supervised Pre-training for Conversational Text-to-Speech Synthesis | Aug 31, 2023 | Expressive Speech Synthesis, Sentence | Unverified | 0 |
| DiffProsody: Diffusion-based Latent Prosody Generation for Expressive Speech Synthesis with Prosody Conditional Adversarial Training | Jul 31, 2023 | Denoising, Expressive Speech Synthesis | Code Available | 1 |
| SC VALL-E: Style-Controllable Zero-Shot Text to Speech Synthesizer | Jul 20, 2023 | Expressive Speech Synthesis, Language Modelling | Code Available | 1 |
| Cross-lingual Prosody Transfer for Expressive Machine Dubbing | Jun 20, 2023 | Expressive Speech Synthesis, Speech Synthesis | Unverified | 0 |
| EMNS /Imz/ Corpus: An emotive single-speaker dataset for narrative storytelling in games, television and graphic novels | May 22, 2023 | Expressive Speech Synthesis, Speech Synthesis | Code Available | 1 |
| Enhancing Suno's Bark Text-to-Speech Model: Addressing Limitations Through Meta's Encodec and Pre-Trained Hubert | Apr 18, 2023 | Audio Generation, Expressive Speech Synthesis | Code Available | 4 |
| Ensemble prosody prediction for expressive speech synthesis | Apr 3, 2023 | Diversity, Ensemble Learning | Unverified | 0 |
| On granularity of prosodic representations in expressive text-to-speech | Jan 26, 2023 | Expressive Speech Synthesis, Speech Synthesis | Unverified | 0 |
| Multi-Speaker Expressive Speech Synthesis via Multiple Factors Decoupling | Nov 19, 2022 | Expressive Speech Synthesis, Speech Synthesis | Unverified | 0 |
| Predicting phoneme-level prosody latents using AR and flow-based Prior Networks for expressive speech synthesis | Nov 2, 2022 | Expressive Speech Synthesis, Speech Synthesis | Unverified | 0 |