SOTAVerified

Expressive Speech Synthesis

Papers

Showing 1-47 of 47 papers

| Title | Status | Hype |
|---|---|---|
| NonverbalTTS: A Public English Corpus of Text-Aligned Nonverbal Vocalizations with Emotion Annotations for Text-to-Speech | | 0 |
| Prompt-Unseen-Emotion: Zero-shot Expressive Speech Synthesis with Prompt-LLM Contextual Knowledge for Mixed Emotions | | 0 |
| RASMALAI: Resources for Adaptive Speech Modeling in Indian Languages with Accents and Intonations | | 0 |
| Gender Bias in Instruction-Guided Speech Synthesis Models | | 0 |
| Speech Synthesis along Perceptual Voice Quality Dimensions | | 0 |
| MSceneSpeech: A Multi-Scene Speech Dataset For Expressive Speech Synthesis | | 0 |
| Rasa: Building Expressive Speech Synthesis Systems for Indian Languages in Low-resource Settings | Code | 1 |
| Articulatory Phonetics Informed Controllable Expressive Speech Synthesis | Code | 1 |
| Expressivity and Speech Synthesis | | 0 |
| Boosting Multi-Speaker Expressive Speech Synthesis with Semi-supervised Contrastive Learning | | 0 |
| Towards Spontaneous Style Modeling with Semi-supervised Pre-training for Conversational Text-to-Speech Synthesis | | 0 |
| DiffProsody: Diffusion-based Latent Prosody Generation for Expressive Speech Synthesis with Prosody Conditional Adversarial Training | Code | 1 |
| SC VALL-E: Style-Controllable Zero-Shot Text to Speech Synthesizer | Code | 1 |
| Cross-lingual Prosody Transfer for Expressive Machine Dubbing | | 0 |
| EMNS /Imz/ Corpus: An emotive single-speaker dataset for narrative storytelling in games, television and graphic novels | Code | 1 |
| Enhancing Suno's Bark Text-to-Speech Model: Addressing Limitations Through Meta's Encodec and Pre-Trained Hubert | Code | 4 |
| Ensemble prosody prediction for expressive speech synthesis | | 0 |
| On granularity of prosodic representations in expressive text-to-speech | | 0 |
| Multi-Speaker Expressive Speech Synthesis via Multiple Factors Decoupling | | 0 |
| Predicting phoneme-level prosody latents using AR and flow-based Prior Networks for expressive speech synthesis | | 0 |
| Learning utterance-level representations through token-level acoustic latents prediction for Expressive Speech Synthesis | | 0 |
| Self-supervised Context-aware Style Representation for Expressive Speech Synthesis | | 0 |
| Fine-grained Noise Control for Multispeaker Speech Synthesis | | 0 |
| Towards Expressive Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis | | 0 |
| Word-Level Style Control for Expressive, Non-attentive Speech Synthesis | | 0 |
| Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech | Code | 1 |
| Referee: Towards reference-free cross-speaker style transfer with low-quality data for expressive speech synthesis | | 0 |
| Cross-speaker Style Transfer with Prosody Bottleneck in Neural Speech Synthesis | | 0 |
| UniTTS: Residual Learning of Unified Embedding Space for Speech Style Control | | 0 |
| Towards Multi-Scale Style Control for Expressive Speech Synthesis | | 0 |
| Sentiment Analysis for Emotional Speech Synthesis in a News Dialogue System | | 0 |
| Hierarchical Multi-Grained Generative Model for Expressive Speech Synthesis | | 0 |
| Laughter Synthesis: Combining Seq2seq modeling with Transfer Learning | Code | 1 |
| Using VAEs and Normalizing Flows for One-shot Text-To-Speech Synthesis of Expressive Speech | | 0 |
| The Theory behind Controllable Expressive Speech Synthesis: a Cross-disciplinary Approach | | 0 |
| Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis | Code | 1 |
| Visualization and Interpretation of Latent Spaces for Controlling Expressive Speech Synthesis through Audio Analysis | Code | 1 |
| Exploring Transfer Learning for Low Resource Emotional TTS | Code | 1 |
| Robust and fine-grained prosody control of end-to-end speech synthesis | Code | 0 |
| SynPaFlex-Corpus: An Expressive French Audiobooks Corpus dedicated to expressive speech synthesis. | | 0 |
| Expressive Speech Synthesis via Modeling Expressions with Variational Autoencoder | | 0 |
| Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron | Code | 1 |
| Uncovering Latent Style Factors for Expressive Speech Synthesis | | 0 |
| Continuous Expressive Speaking Styles Synthesis based on CVSM and MR-HMM | | 0 |
| Combining Manual and Automatic Prosodic Annotation for Expressive Speech Synthesis | | 0 |
| Alert!... Calm Down, There is Nothing to Worry About. Warning and Soothing Speech Synthesis. | | 0 |
| Evaluating expressive speech synthesis from audiobook corpora for conversational phrases | | 0 |

No leaderboard results yet.