Text to Speech

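from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku").
    # gTTS raises ValueError for language codes it does not support,
    # so check gtts.lang.tts_langs() for "ku" before relying on this.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    # Open the audio file with the default player (Windows only).
    os.system(f'start "" "{output_file}"')

# Example (the text means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")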
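
The snippet above opens the saved file with the Windows-only `start` command. A minimal sketch of a cross-platform alternative follows, using only the standard library. The helper names `open_command` and `open_audio` are hypothetical, and the fallback assumes a Linux or BSD desktop where `xdg-open` is installed.

```python
import os
import platform
import subprocess


def open_command(path, system=None):
    """Return the argv that opens `path` with the default application.

    Returns None on Windows, where os.startfile is used instead.
    """
    system = system or platform.system()
    if system == "Windows":
        return None
    if system == "Darwin":
        return ["open", path]
    # Assumption: a desktop environment that provides xdg-open.
    return ["xdg-open", path]


def open_audio(path):
    """Open the audio file with the platform's default application."""
    command = open_command(path)
    if command is None:
        os.startfile(path)  # available on Windows only
    else:
        subprocess.run(command, check=False)
```

Calling `open_audio(output_file)` in place of the `os.system(...)` line would keep the rest of the function unchanged.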

Papers

Showing 1–25 of 1419 papers

Title	Date	Tasks	Status	Hype	Score
Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens	Mar 3, 2025	Attribute, text-to-speech	Code Available	11	5
IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System	Feb 8, 2025	Decoder, Language Modeling	Code Available	11	5
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching	Oct 9, 2024	Denoising, text-to-speech	Code Available	11	5
CosyVoice: A Scalable Multilingual Zero-shot Text-to-speech Synthesizer based on Supervised Semantic Tokens	Jul 7, 2024	Language Modelling, Large Language Model	Code Available	11	5
Overview of the Amphion Toolkit (v0.2)	Jan 26, 2025	text-to-speech, Text to Speech	Code Available	9	5
Moshi: a speech-text foundation model for real-time dialogue	Sep 17, 2024	Action Detection, Activity Detection	Code Available	9	5
Metis: A Foundation Speech Generation Model with Masked Generative Pre-training	Feb 5, 2025	Self-Supervised Learning, Speech Enhancement	Code Available	9	5
VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild	Mar 25, 2024	Decoder, Language Modeling	Code Available	9	5
Natural language guidance of high-fidelity text-to-speech with synthetic annotations	Feb 2, 2024	In-Context Learning, Language Modeling	Code Available	9	5
MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer	Sep 1, 2024	Self-Supervised Learning, text-to-speech	Code Available	9	5
Speechless: Speech Instruction Training Without Speech for Low Resource Languages	May 23, 2025	speech-recognition, Speech Recognition	Code Available	7	5
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers	Jan 5, 2023	In-Context Learning, Language Modeling	Code Available	7	5
Seed-TTS: A Family of High-Quality Versatile Speech Generation Models	Jun 4, 2024	In-Context Learning, Language Modelling	Code Available	7	5
GLM-4-Voice: Towards Intelligent and Human-Like End-to-End Spoken Chatbot	Dec 3, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	7	5
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit	May 20, 2022	All, Automatic Speech Recognition (ASR)	Code Available	6	5
ERNIE-SAT: Speech and Text Joint Pretraining for Cross-Lingual Multi-Speaker Text-to-Speech	Nov 7, 2022	Representation Learning, Speech Representation Learning	Code Available	6	5
Better speech synthesis through scaling	May 12, 2023	Image Generation, Speech Synthesis	Code Available	6	5
SpeechGPT-Gen: Scaling Chain-of-Information Speech Generation	Jan 24, 2024	text-to-speech, Text to Speech	Code Available	5	5
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models	Jun 13, 2023	Speech Synthesis, text-to-speech	Code Available	5	5
XPhoneBERT: A Pre-trained Multilingual Model for Phoneme Representations for Text-to-Speech	May 31, 2023	text-to-speech, Text to Speech	Code Available	5	5
Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions	Jan 20, 2023	text-to-speech, Text to Speech	Code Available	5	5
Enabling Auditory Large Language Models for Automatic Speech Quality Evaluation	Sep 25, 2024	text-to-speech, Text to Speech	Code Available	5	5
Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling	Mar 7, 2023	In-Context Learning, Language Modeling	Code Available	5	5
Ming-Omni: A Unified Multimodal Model for Perception and Generation	Jun 11, 2025	Image Generation, text-to-speech	Code Available	4	5
AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining	Aug 10, 2023	Audio Generation, In-Context Learning	Code Available	4	5

No leaderboard results yet.