SOTAVerified

Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selected for Kurdish).
    # Note: "ku" may not be among gTTS's supported languages; check gtts.lang.tts_langs() first.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (on Windows)

# Example (the text means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
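The snippet above opens the saved file with the Windows `start` command through `os.system`, which only works on Windows and can break on file names containing spaces. A minimal sketch of a portable alternative follows, using only the standard library. The helper names `build_open_command` and `open_audio_file` are hypothetical, and the non-Windows branches assume that `open` (macOS) or `xdg-open` (Linux desktops with xdg-utils) is available.

```python
import subprocess
import sys


def build_open_command(path, platform=sys.platform):
    """Return the argument list that opens `path` with the default application."""
    if platform.startswith("win"):
        # `start` is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    # Assumption: a Linux/BSD desktop with xdg-utils installed.
    return ["xdg-open", path]


def open_audio_file(path):
    """Launch the platform's default player for `path` (hypothetical helper)."""
    # Passing a list avoids shell interpolation of the file name.
    subprocess.run(build_open_command(path), check=False)
```

Under these assumptions, `open_audio_file(output_file)` could stand in for the `os.system(f"start {output_file}")` line.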

Papers

Showing 451-500 of 1419 papers

Title | Status | Hype
Grad-StyleSpeech: Any-speaker Adaptive Text-to-Speech Synthesis with Diffusion Models | | 0
Emotion controllable speech synthesis using emotion-unlabeled dataset with the assistance of cross-domain speech emotion recognition | | 0
EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model | | 0
EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting | | 0
Empathic Machines: Using Intermediate Features as Levers to Emulate Emotions in Text-To-Speech Systems | | 0
Empathic Machines: Using Intermediate Features as Levers to Emulate Emotions in Text-To-Speech Systems | | 0
Emphasis control for parallel neural TTS | | 0
Adversarial training of Keyword Spotting to Minimize TTS Data Overfitting | | 0
Emphasized Accent Phrase Prediction from Text for Advertisement Text-To-Speech Synthesis | | 0
Emphasizing Unseen Words: New Vocabulary Acquisition for End-to-End Speech Recognition | | 0
EmoCat: Language-agnostic Emotional Voice Conversion | | 0
Empowering Global Voices: A Data-Efficient, Phoneme-Tone Adaptive Approach to High-Fidelity Speech Synthesis | | 0
Bridging the Gap: An Intermediate Language for Enhanced and Cost-Effective Grapheme-to-Phoneme Conversion with Homographs with Multiple Pronunciations Disambiguation | | 0
BreezyVoice: Adapting TTS for Taiwanese Mandarin with Enhanced Polyphone Disambiguation -- Challenges and Insights | | 0
AnyoneNet: Synchronized Speech and Talking Head Generation for Arbitrary Person | | 0
End-to-End Feedback Loss in Speech Chain Framework via Straight-Through Estimator | | 0
ELLA-V: Stable Neural Codec Language Modeling with Alignment-guided Sequence Reordering | | 0
ELAICHI: Enhancing Low-resource TTS by Addressing Infrequent and Low-frequency Character Bigrams | | 0
Braille-to-Speech Generator: Audio Generation Based on Joint Fine-Tuning of CLIP and Fastspeech2 | | 0
End-to-end speech recognition modeling from de-identified data | | 0
End-to-End Text-to-Speech Based on Latent Representation of Speaking Styles Using Spontaneous Dialogue | | 0
End-to-end Text-to-speech for Low-resource Languages by Cross-Lingual Transfer Learning | | 0
End-to-End Text-to-Speech using Latent Duration based on VQ-VAE | | 0
Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation | | 0
Enhancement of Pitch Controllability using Timbre-Preserving Pitch Augmentation in FastPitch | | 0
Enhancing audio quality for expressive Neural Text-to-Speech | | 0
Enhancing Crowdsourced Audio for Text-to-Speech Models | | 0
Enhancing Low-Resource ASR through Versatile TTS: Bridging the Data Gap | | 0
Efficient training strategies for natural sounding speech synthesis and speaker adaptation based on FastPitch | | 0
Bootstrapping non-parallel voice conversion from speaker-adaptive text-to-speech | | 0
Enhancing Speech-to-Speech Translation with Multiple TTS Targets | | 0
Anti-Spoofing Using Transfer Learning with Variational Information Bottleneck | | 0
Adversarial speech for voice privacy protection from Personalized Speech generation | | 0
Enhancing the Stability of LLM-based Speech Generation Systems through Self-Supervised Representations | | 0
Enhancing Zero-shot Text-to-Speech Synthesis with Human Feedback | | 0
Ensemble prosody prediction for expressive speech synthesis | | 0
Environment Aware Text-to-Speech Synthesis | | 0
EPIC TTS Models: Empirical Pruning Investigations Characterizing Text-To-Speech Models | | 0
A Comparative Analysis of Pretrained Language Models for Text-to-Speech | | 0
Efficiently Trained Low-Resource Mongolian Text-to-Speech System Based On FullConv-TTS | | 0
Bootstrap an end-to-end ASR system by multilingual training, transfer learning, text-to-text mapping and synthetic audio | | 0
ESPnet2-TTS: Extending the Edge of TTS Research | | 0
Efficient Incremental Text-to-Speech on GPUs | | 0
ESPnet-ST: All-in-One Speech Translation Toolkit | | 0
Boosting Large Language Model for Speech Synthesis: An Empirical Study | | 0
Evaluating and Improving Automatic Speech Recognition Systems for Korean Meteorological Experts | | 0
Evaluating and Personalizing User-Perceived Quality of Text-to-Speech Voices for Delivering Mindfulness Meditation with Different Physical Embodiments | | 0
Evaluating and reducing the distance between synthetic and real speech distributions | | 0
Evaluating Long-form Text-to-Speech: Comparing the Ratings of Sentences and Paragraphs | | 0
An overview of text-to-speech systems and media applications | | 0
Page 10 of 29

No leaderboard results yet.