Text to Speech

import gTTS import os def text_to_speech_kurdish(text, output_file="output.mp3"): # گۆڕینی نووسین بۆ دەنگ بە زمانی کوردی (هەڵبژاردنی زمانی "ku" بۆ کوردی) tts = gTTS(text=text, lang='ku', slow=False) tts.save(output_file) os.system(f"start {output_file}") # کردنەوەی فایلە دەنگییەکە (لە Windows) # نموونە: text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 101–125 of 1419 papers

Title	Date	Tasks	Status	Hype
Audio Deepfake Detection with Self-Supervised XLS-R and SLS Classifier	Oct 28, 2024	Audio Deepfake DetectionAudio Generation	CodeCode Available	2
CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations	Apr 10, 2024	Dialogue Generationtext-to-speech	CodeCode Available	2
Accelerating Diffusion-based Text-to-Speech Model Training with Dual Modality Alignment	May 26, 2025	text-to-speechText to Speech	CodeCode Available	2
TokenSynth: A Token-based Neural Synthesizer for Instrument Cloning and Text-to-Instrument	Feb 13, 2025	Audio GenerationDecoder	CodeCode Available	2
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis	Apr 21, 2022	DenoisingGPU	CodeCode Available	2
EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control	Oct 1, 2024	Emotional Speech SynthesisSpeech Synthesis	CodeCode Available	2
LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning	Jun 12, 2024	text-to-speechText to Speech	CodeCode Available	2
ÌròyìnSpeech: A multi-purpose Yorùbá Speech Corpus	Jul 29, 2023	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	CodeCode Available	1
JETS: Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech	Mar 31, 2022	text-to-speechText to Speech	CodeCode Available	1
Improving fairness for spoken language understanding in atypical speech with Text-to-Speech	Nov 16, 2023	Data AugmentationFairness	CodeCode Available	1
IESTAC: English-Italian Parallel Corpus for End-to-End Speech-to-Text Machine Translation	Nov 1, 2020	Dynamic Time WarpingMachine Translation	CodeCode Available	1
Improved Child Text-to-Speech Synthesis through Fastpitch-based Transfer Learning	Nov 7, 2023	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	CodeCode Available	1
KazakhTTS: An Open-Source Kazakh Text-to-Speech Synthesis Dataset	Apr 17, 2021	Speech Synthesistext-to-speech	CodeCode Available	1
HiFi-WaveGAN: Generative Adversarial Network with Auxiliary Spectrogram-Phase Loss for High-Fidelity Singing Voice Generation	Oct 23, 2022	Generative Adversarial NetworkSinging Voice Synthesis	CodeCode Available	1
HM-Conformer: A Conformer-based audio deepfake detection system with hierarchical pooling and multi-level classification token aggregation methods	Sep 15, 2023	Audio Deepfake DetectionDeepFake Detection	CodeCode Available	1
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech	May 13, 2021	DecoderSpeech Synthesis	CodeCode Available	1
Google Crowdsourced Speech Corpora and Related Open-Source Resources for Low-Resource Languages and Dialects: An Overview	Oct 14, 2020	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	CodeCode Available	1
GUIRoboTron-Speech: Towards Automated GUI Agents Based on Speech Instructions	Jun 10, 2025	text-to-speechText to Speech	CodeCode Available	1
HUI-Audio-Corpus-German: A high quality TTS dataset	Jun 11, 2021	Text Normalizationtext-to-speech	CodeCode Available	1
Imaginary Voice: Face-styled Diffusion Model for Text-to-Speech	Feb 27, 2023	Speech Synthesistext-to-speech	CodeCode Available	1
g2pM: A Neural Grapheme-to-Phoneme Conversion Package for Mandarin Chinese Based on a New Open Benchmark Dataset	Apr 7, 2020	Grapheme-to-Phoneme ConversionPolyphone disambiguation	CodeCode Available	1
Improving TTS for Shanghainese: Addressing Tone Sandhi via Word Segmentation	Jul 30, 2023	text-to-speechText to Speech	CodeCode Available	1
In Other News: A Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data	Apr 4, 2019	Speech Synthesistext-to-speech	CodeCode Available	1
InstructTTSEval: Benchmarking Complex Natural-Language Instruction Following in Text-to-Speech Systems	Jun 19, 2025	BenchmarkingDescriptive	CodeCode Available	1
ALIF: Low-Cost Adversarial Audio Attacks on Black-Box Speech Platforms using Linguistic Features	Aug 3, 2024	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	CodeCode Available	1

Show:10 25 50

← PrevPage 5 of 57Next →

No leaderboard results yet.