SOTAVerified

Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (selecting the language code "ku" for Kurdish).
    # Assumption: the installed gTTS release supports the "ku" code.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (on Windows)

# Example (the Kurdish string means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
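The snippet above opens the saved file with `start`, which exists only on Windows. A minimal sketch of a portable alternative follows, assuming a desktop environment where `open` (macOS) or `xdg-open` (Linux with xdg-utils) is available. The helper names `build_open_command` and `open_audio_file` are hypothetical and are not part of gTTS.

```python
import platform
import subprocess


def build_open_command(path, system=None):
    """Return the argv list that opens `path` with the default application."""
    # `system` can be passed explicitly (useful for testing); otherwise detect it.
    system = system or platform.system()
    if system == "Windows":
        # `start` is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if system == "Darwin":
        return ["open", path]
    # Assumption: any other platform is a Linux/BSD desktop with xdg-utils.
    return ["xdg-open", path]


def open_audio_file(path):
    """Open the audio file without going through a shell string."""
    subprocess.run(build_open_command(path), check=False)
```

Passing the path as a separate argv element, rather than interpolating it into an `os.system` string, also avoids problems with file names that contain spaces.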

Papers

Showing 51-75 of 1419 papers

| Title | Status | Hype |
| --- | --- | --- |
| TokenSynth: A Token-based Neural Synthesizer for Instrument Cloning and Text-to-Instrument | Code | 2 |
| RingFormer: A Neural Vocoder with Ring Attention and Convolution-Augmented Transformer | Code | 2 |
| EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vector | Code | 2 |
| Lina-Speech: Gated Linear Attention is a Fast and Parameter-Efficient Learner for text-to-speech synthesis | Code | 2 |
| Audio Deepfake Detection with Self-Supervised XLS-R and SLS Classifier | Code | 2 |
| EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control | Code | 2 |
| Recent Advances in Speech Language Models: A Survey | Code | 2 |
| SafeEar: Content Privacy-Preserving Audio Deepfake Detection | Code | 2 |
| SSR-Speech: Towards Stable, Safe and Robust Zero-shot Text-based Speech Editing and Synthesis | Code | 2 |
| IndicVoices-R: Unlocking a Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS | Code | 2 |
| Sample-Efficient Diffusion for Text-To-Speech Synthesis | Code | 2 |
| TTSDS -- Text-to-Speech Distribution Score | Code | 2 |
| CATT: Character-based Arabic Tashkeel Transformer | Code | 2 |
| DEX-TTS: Diffusion-based EXpressive Text-to-Speech with Style Modeling on Time Variability | Code | 2 |
| DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors | Code | 2 |
| EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for Controllable Emotional Text-to-Speech | Code | 2 |
| LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning | Code | 2 |
| WenetSpeech4TTS: A 12,800-hour Mandarin TTS Corpus for Large Speech Generation Model Benchmark | Code | 2 |
| Small-E: Small Language Model with Linear Attention for Efficient Speech Synthesis | Code | 2 |
| TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation | Code | 2 |
| Llama-VITS: Enhancing TTS Synthesis with Semantic Awareness | Code | 2 |
| CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations | Code | 2 |
| CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers and Consistency Models | Code | 2 |
| An Automated End-to-End Open-Source Software for High-Quality Text-to-Speech Dataset Generation | Code | 2 |
| Paralinguistics-Aware Speech-Empowered Large Language Models for Natural Conversation | Code | 2 |
Page 3 of 57

No leaderboard results yet.