SOTAVerified

Text to Speech

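from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selects Kurdish).
    # Assumes "ku" is available in the installed gTTS version; gTTS raises
    # ValueError for language codes it does not support.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (Windows only)

# Example (the string means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")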
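
The snippet above opens the saved file with `start`, which only exists on Windows. A minimal sketch of a cross-platform alternative follows, assuming `open` is available on macOS and `xdg-open` on Linux desktops. `open_command` and `play` are hypothetical helper names used for illustration and are not part of gTTS.

```python
import subprocess
import sys

def open_command(path, platform=sys.platform):
    """Return the argv list that opens `path` with the default application."""
    if platform.startswith("win"):
        # `start` is a cmd.exe built-in; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    # Assumption: a freedesktop-style Linux desktop with xdg-open installed.
    return ["xdg-open", path]

def play(path):
    """Open the audio file with the platform's default player."""
    subprocess.run(open_command(path), check=False)
```

Calling `play(output_file)` in place of the `os.system` line would be one way to use it. Passing an argument list instead of a shell string also avoids quoting problems when the file name contains spaces.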

Papers

Showing 651-675 of 1419 papers

Title | Status | Hype
A Survey on Audio Diffusion Models: Text To Speech Synthesis and Enhancement in Generative AI | | 0
Code-Switching Text Generation and Injection in Mandarin-English ASR | | 0
Cross-speaker Emotion Transfer by Manipulating Speech Style Latents | | 0
Controllable Prosody Generation With Partial Inputs | | 0
QI-TTS: Questioning Intonation Control for Emotional Speech Synthesis | | 0
An End-to-End Neural Network for Image-to-Audio Transformation | | 0
Text-to-ECG: 12-Lead Electrocardiogram Synthesis conditioned on Clinical Text Reports | Code | 0
Do Prosody Transfer Models Transfer Prosody? | | 0
Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling | Code | 5
FoundationTTS: Text-to-Speech for ASR Customization with Generative Language Model | | 0
Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations | Code | 1
Evaluating Parameter-Efficient Transfer Learning Approaches on SURE Benchmark for Speech Understanding | Code | 1
Fine-grained Emotional Control of Text-To-Speech: Learning To Rank Inter- And Intra-Class Emotion Intensities | | 0
LiteG2P: A fast, light and high accuracy model for grapheme-to-phoneme conversion | | 0
Leveraging Large Text Corpora for End-to-End Speech Summarization | | 0
ParrotTTS: Text-to-Speech synthesis by exploiting self-supervised representations | | 0
DTW-SiameseNet: Dynamic Time Warped Siamese Network for Mispronunciation Detection and Correction | | 0
ClArTTS: An Open-Source Classical Arabic Text-to-Speech Corpus | | 0
UniFLG: Unified Facial Landmark Generator from Text or Speech | | 0
Automatic Heteronym Resolution Pipeline Using RAD-TTS Aligners | | 0
CrossSpeech: Speaker-independent Acoustic Representation for Cross-lingual Speech Synthesis | | 0
Duration-aware pause insertion using pre-trained language model for multi-speaker text-to-speech | | 0
Imaginary Voice: Face-styled Diffusion Model for Text-to-Speech | Code | 1
Varianceflow: High-Quality and Controllable Text-to-Speech using Variance Information via Normalizing Flow | | 0
PITS: Variational Pitch Inference without Fundamental Frequency for End-to-End Pitch-controllable TTS | Code | 2
Page 27 of 57

No leaderboard results yet.