SOTAVerified

Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selects Kurdish).
    # Check gtts.lang.tts_langs() to confirm that "ku" is available in your gTTS version.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (on Windows)

# Example (the string means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
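The snippet above opens the generated file with `start`, which exists only in the Windows shell. A minimal sketch of a cross-platform alternative follows; the helper names `player_command` and `open_audio` are hypothetical, and the `xdg-open` branch assumes a freedesktop-style Linux desktop.

```python
import os
import platform
import subprocess


def player_command(path, system=None):
    """Return the argv list that opens `path` with the default application.

    Returns None on Windows, where os.startfile is used instead of a command.
    """
    system = system or platform.system()
    if system == "Windows":
        return None
    if system == "Darwin":
        return ["open", path]
    # Assumption: any other system provides xdg-open.
    return ["xdg-open", path]


def open_audio(path):
    """Open the audio file with the platform's default player."""
    command = player_command(path)
    if command is None:
        os.startfile(path)  # available on Windows only
    else:
        subprocess.run(command, check=False)
```

Calling `open_audio(output_file)` in place of the `os.system(...)` line would keep the rest of the function unchanged. Passing the path as a list element also avoids shell quoting problems when the file name contains spaces.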

Papers

Showing 126-150 of 1419 papers

Title | Status | Hype
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis | Code | 1
Mitigating Unauthorized Speech Synthesis for Voice Protection | Code | 1
Enhancing Speech Intelligibility in Text-To-Speech Synthesis using Speaking Style Conversion | Code | 1
End-to-end Lyrics Alignment for Polyphonic Music Using an Audio-to-Character Recognition Model | Code | 1
End-to-End Adversarial Text-to-Speech | Code | 1
End to End Lip Synchronization with a Temporal AutoEncoder | Code | 1
Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech | Code | 1
Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining | Code | 1
A Character-level Span-based Model for Mandarin Prosodic Structure Prediction | Code | 1
ShiftySpeech: A Large-Scale Synthetic Speech Dataset with Distribution Shifts | Code | 1
EMNS /Imz/ Corpus: An emotive single-speaker dataset for narrative storytelling in games, television and graphic novels | Code | 1
KazEmoTTS: A Dataset for Kazakh Emotional Text-to-Speech Synthesis | Code | 1
From Tens of Hours to Tens of Thousands: Scaling Back-Translation for Speech Recognition | Code | 1
Laugh Now Cry Later: Controlling Time-Varying Emotional States of Flow-Matching-Based Zero-Shot Text-to-Speech | Code | 1
Learning Arousal-Valence Representation from Categorical Emotion Labels of Speech | Code | 1
EfficientSpeech: An On-Device Text to Speech Model | Code | 1
EmoSpeech: Guiding FastSpeech2 Towards Emotional Text to Speech | Code | 1
Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention | Code | 1
Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on Data-Driven Deep Learning | Code | 1
ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet | Code | 1
Learning to Dub Movies via Hierarchical Prosody Models | Code | 1
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search | Code | 1
Mixer-TTS: non-autoregressive, fast and compact text-to-speech model conditioned on language model embeddings | Code | 1
E2 TTS: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS | Code | 1
InstructTTSEval: Benchmarking Complex Natural-Language Instruction Following in Text-to-Speech Systems | Code | 1
Page 6 of 57

No leaderboard results yet.