SOTAVerified

Text to Speech

from gtts import gTTS
import os


def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selects Kurdish).
    # Note: this assumes the installed gTTS version accepts "ku".
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (on Windows)


# Example (the string means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
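The snippet above plays the saved file with `start`, which only exists on Windows. As a minimal sketch of a more portable approach, the helper below picks an opener command per operating system using only the standard library. The names `build_open_command` and `open_file` are hypothetical, and the fallback assumes a Linux-like desktop with `xdg-open` installed.

```python
import platform
import subprocess


def build_open_command(path, system=None):
    """Return the argv list that opens `path` with the default application."""
    system = system or platform.system()
    if system == "Windows":
        # `start` is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if system == "Darwin":
        # macOS ships the `open` command.
        return ["open", path]
    # Assumption: any other system is a Linux-like desktop with xdg-utils.
    return ["xdg-open", path]


def open_file(path):
    """Open `path` with the platform's default application (best effort)."""
    subprocess.run(build_open_command(path), check=False)
```

Calling `open_file("output.mp3")` after `tts.save(...)` would then replace the `os.system` line. Passing the command as a list also avoids shell quoting problems when the file name contains spaces.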

Papers

Showing 51-75 of 1419 papers

Title | Status | Hype
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism | Code | 2
Nix-TTS: Lightweight and End-to-End Text-to-Speech via Module-wise Distillation | Code | 2
PAM: Prompting Audio-Language Models for Audio Quality Assessment | Code | 2
DurFlex-EVC: Duration-Flexible Emotional Voice Conversion Leveraging Discrete Representations without Text Alignment | Code | 2
Differentiable Reward Optimization for LLM based TTS system | Code | 2
DEX-TTS: Diffusion-based EXpressive Text-to-Speech with Style Modeling on Time Variability | Code | 2
PITS: Variational Pitch Inference without Fundamental Frequency for End-to-End Pitch-controllable TTS | Code | 2
Efficient Neural Audio Synthesis | Code | 2
NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers | Code | 2
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality | Code | 2
EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vector | Code | 2
EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for Controllable Emotional Text-to-Speech | Code | 2
Neural Speech Synthesis with Transformer Network | Code | 2
Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram | Code | 2
RWKVTTS: Yet another TTS based on RWKV-7 | Code | 2
CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations | Code | 2
LPCNet: Improving Neural Speech Synthesis Through Linear Prediction | Code | 2
CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers and Consistency Models | Code | 2
CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model | Code | 2
CATT: Character-based Arabic Tashkeel Transformer | Code | 2
Lina-Speech: Gated Linear Attention is a Fast and Parameter-Efficient Learner for text-to-speech synthesis | Code | 2
Llama-VITS: Enhancing TTS Synthesis with Semantic Awareness | Code | 2
LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT | Code | 2
iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform | Code | 2
GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech | Code | 2
Page 3 of 57

No leaderboard results yet.