SOTAVerified|Agents Browse Leaderboard About Blog

Text to Speech

import gTTS import os def text_to_speech_kurdish(text, output_file="output.mp3"): # گۆڕینی نووسین بۆ دەنگ بە زمانی کوردی (هەڵبژاردنی زمانی "ku" بۆ کوردی) tts = gTTS(text=text, lang='ku', slow=False) tts.save(output_file) os.system(f"start {output_file}") # کردنەوەی فایلە دەنگییەکە (لە Windows) # نموونە: text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 26–50 of 1419 papers

Title	Date	Tasks	Status	Hype	Score
AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining	Aug 10, 2023	Audio GenerationIn-Context Learning	CodeCode Available	4	5
Enhancing Suno's Bark Text-to-Speech Model: Addressing Limitations Through Meta's Encodec and Pre-Trained Hubert	Apr 18, 2023	Audio GenerationExpressive Speech Synthesis	CodeCode Available	4	5
VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model	May 6, 2025	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	CodeCode Available	4	5
Ming-Omni: A Unified Multimodal Model for Perception and Generation	Jun 11, 2025	Image Generationtext-to-speech	CodeCode Available	4	5
Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning	Jul 9, 2019	Speech Synthesistext-to-speech	CodeCode Available	3	5
Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play	May 5, 2025	AI AgentAutomatic Speech Recognition	CodeCode Available	3	5
HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis	Nov 21, 2023	Speech SynthesisSuper-Resolution	CodeCode Available	3	5
WavChat: A Survey of Spoken Dialogue Models	Nov 15, 2024	speech-recognitionSpeech Recognition	CodeCode Available	3	5
Towards Controllable Speech Synthesis in the Era of Large Language Models: A Survey	Dec 9, 2024	Speech SynthesisSurvey	CodeCode Available	3	5
ControlSpeech: Towards Simultaneous and Independent Zero-shot Speaker Cloning and Zero-shot Language Style Control	Jun 3, 2024	Speech Synthesistext-to-speech	CodeCode Available	3	5
SoundStream: An End-to-End Neural Audio Codec	Jul 7, 2021	CPUDecoder	CodeCode Available	3	5
UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation	Jun 15, 2021	Speech Synthesistext-to-speech	CodeCode Available	3	5
EmergentTTS-Eval: Evaluating TTS Models on Complex Prosodic, Expressiveness, and Linguistic Challenges Using Model-as-a-Judge	May 29, 2025	text-to-speechText to Speech	CodeCode Available	3	5
PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform Generation	Aug 14, 2024	Speech Synthesistext-to-speech	CodeCode Available	3	5
Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model	Aug 30, 2024	Audio CompressionAudio Generation	CodeCode Available	3	5
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models	Mar 5, 2024	QuantizationSpeech Synthesis	CodeCode Available	3	5
MoonCast: High-Quality Zero-Shot Podcast Generation	Mar 18, 2025	Speech Synthesistext-to-speech	CodeCode Available	3	5
ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech	Jul 13, 2022	DenoisingGPU	CodeCode Available	3	5
Lina-Speech: Gated Linear Attention is a Fast and Parameter-Efficient Learner for text-to-speech synthesis	Oct 30, 2024	Speech Synthesistext-to-speech	CodeCode Available	2	5
Llama-VITS: Enhancing TTS Synthesis with Semantic Awareness	Apr 10, 2024	Speech Synthesistext-to-speech	CodeCode Available	2	5
LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning	Jun 12, 2024	text-to-speechText to Speech	CodeCode Available	2	5
Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform	Oct 28, 2022	CPUKnowledge Distillation	CodeCode Available	2	5
LPCNet: Improving Neural Speech Synthesis Through Linear Prediction	Oct 28, 2018	PredictionSpeech Synthesis	CodeCode Available	2	5
IndicVoices-R: Unlocking a Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS	Sep 9, 2024	DenoisingSpeech Enhancement	CodeCode Available	2	5
iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform	Mar 4, 2022	Speech Synthesistext-to-speech	CodeCode Available	2	5

Show:10 25 50

← PrevPage 2 of 57Next →

No leaderboard results yet.