Text to Speech

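from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku").
    # gTTS checks the code against its supported list by default, so confirm
    # that "ku" is available with gtts.lang.tts_langs() before relying on it.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (Windows only)

# Example (the Kurdish text reads "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")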
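The snippet above opens the generated file with the Windows `start` command, so playback fails on other systems. A minimal sketch of a portable alternative follows. The helper names `player_command` and `play` are hypothetical, and the non-Windows branches assume that `open` (macOS) or `xdg-open` (desktop Linux with xdg-utils) is installed.

```python
import subprocess
import sys


def player_command(path, platform=sys.platform):
    """Return the argument list that opens `path` with the default application."""
    if platform.startswith("win"):
        # `start` treats its first quoted argument as a window title,
        # so an empty title is passed before the file path.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    # Assumption: a desktop Linux/BSD system where xdg-open is available.
    return ["xdg-open", path]


def play(path):
    """Open the audio file without building a shell command string."""
    subprocess.run(player_command(path), check=False)
```

Passing the command as a list avoids quoting problems when the output file name contains spaces. Calling `play("output.mp3")` after `tts.save(...)` would stand in for the `os.system` line.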

Papers


Showing 51–75 of 1419 papers

Title	Date	Tasks	Status	Hype
TokenSynth: A Token-based Neural Synthesizer for Instrument Cloning and Text-to-Instrument	Feb 13, 2025	Audio Generation, Decoder	Code Available	2
RingFormer: A Neural Vocoder with Ring Attention and Convolution-Augmented Transformer	Jan 2, 2025	Audio Generation, text-to-speech	Code Available	2
EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vector	Nov 4, 2024	Decoder, Emotional Speech Synthesis	Code Available	2
Lina-Speech: Gated Linear Attention is a Fast and Parameter-Efficient Learner for text-to-speech synthesis	Oct 30, 2024	Speech Synthesis, text-to-speech	Code Available	2
Audio Deepfake Detection with Self-Supervised XLS-R and SLS Classifier	Oct 28, 2024	Audio Deepfake Detection, Audio Generation	Code Available	2
EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control	Oct 1, 2024	Emotional Speech Synthesis, Speech Synthesis	Code Available	2
Recent Advances in Speech Language Models: A Survey	Oct 1, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	2
SafeEar: Content Privacy-Preserving Audio Deepfake Detection	Sep 14, 2024	Audio Deepfake Detection, DeepFake Detection	Code Available	2
SSR-Speech: Towards Stable, Safe and Robust Zero-shot Text-based Speech Editing and Synthesis	Sep 11, 2024	Decoder, Speech Synthesis	Code Available	2
IndicVoices-R: Unlocking a Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS	Sep 9, 2024	Denoising, Speech Enhancement	Code Available	2
Sample-Efficient Diffusion for Text-To-Speech Synthesis	Sep 1, 2024	Language Modeling, Language Modelling	Code Available	2
TTSDS -- Text-to-Speech Distribution Score	Jul 17, 2024	text-to-speech, Text to Speech	Code Available	2
CATT: Character-based Arabic Tashkeel Transformer	Jul 3, 2024	Arabic Text Diacritization, Decoder	Code Available	2
DEX-TTS: Diffusion-based EXpressive Text-to-Speech with Style Modeling on Time Variability	Jun 27, 2024	Speech Synthesis, text-to-speech	Code Available	2
DiTTo-TTS: Diffusion Transformers for Scalable Text-to-Speech without Domain-Specific Factors	Jun 17, 2024	text-to-speech, Text to Speech	Code Available	2
EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for Controllable Emotional Text-to-Speech	Jun 12, 2024	Emotional Speech Synthesis, text-to-speech	Code Available	2
LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning	Jun 12, 2024	text-to-speech, Text to Speech	Code Available	2
WenetSpeech4TTS: A 12,800-hour Mandarin TTS Corpus for Large Speech Generation Model Benchmark	Jun 9, 2024	text-to-speech, Text to Speech	Code Available	2
Small-E: Small Language Model with Linear Attention for Efficient Speech Synthesis	Jun 6, 2024	Decoder, Inductive Bias	Code Available	2
TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation	May 28, 2024	Machine Translation, speech-recognition	Code Available	2
Llama-VITS: Enhancing TTS Synthesis with Semantic Awareness	Apr 10, 2024	Speech Synthesis, text-to-speech	Code Available	2
CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations	Apr 10, 2024	Dialogue Generation, text-to-speech	Code Available	2
CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers and Consistency Models	Mar 31, 2024	Denoising, Speech Synthesis	Code Available	2
An Automated End-to-End Open-Source Software for High-Quality Text-to-Speech Dataset Generation	Feb 26, 2024	Dataset Generation, text-to-speech	Code Available	2
Paralinguistics-Aware Speech-Empowered Large Language Models for Natural Conversation	Feb 8, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	2


