Text to Speech

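from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selects Kurdish).
    # Note: gTTS validates `lang` by default and raises ValueError for
    # unsupported codes, so confirm that "ku" is available in your installed version.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (on Windows)

# Example (the text means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")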
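The snippet above opens the saved file with `start`, which only exists on Windows. A minimal sketch of a cross-platform alternative follows, using only the standard library. The helper names `open_command` and `open_file` are hypothetical, and the Linux branch assumes `xdg-open` (from xdg-utils) is installed.

```python
import subprocess
import sys


def open_command(path, platform=sys.platform):
    """Return the argv list that opens `path` with the default application."""
    if platform.startswith("win"):
        # `start` is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    # Assumption: a desktop Linux/BSD system with xdg-utils installed.
    return ["xdg-open", path]


def open_file(path):
    """Run the platform's opener command for `path`."""
    subprocess.run(open_command(path), check=True)
```

Passing the arguments as a list avoids shell quoting problems when the output file name contains spaces, which the `os.system(f"start {output_file}")` form does not handle.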

Papers


Showing 1051–1100 of 1419 papers

Title	Date	Tasks	Status
Unsupervised TTS Acoustic Modeling for TTS with Conditional Disentangled Sequential VAE	Jun 6, 2022	Representation Learning, Speech Representation Learning	Unverified
UzbekTagger: The rule-based POS tagger for Uzbek language	Jan 30, 2023	Language Modeling, Language Modelling	Unverified
VAKTA-SETU: A Speech-to-Speech Machine Translation Service in Select Indic Languages	May 21, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers	Jun 8, 2024	Speech Synthesis, text-to-speech	Unverified
VALL-E R: Robust and Efficient Zero-Shot Text-to-Speech Synthesis via Monotonic Alignment	Jun 12, 2024	Quantization, Speech Synthesis	Unverified
VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech	Jan 25, 2024	Decoder, Hallucination	Unverified
VARA-TTS: Non-Autoregressive Text-to-Speech Synthesis based on Very Deep VAE with Residual Attention	Feb 12, 2021	Speech Synthesis, text-to-speech	Unverified
可變速中文文字轉語音系統 (Variable Speech Rate Mandarin Chinese Text-to-Speech System) [In Chinese]	Mar 1, 2012	text-to-speech, Text to Speech	Unverified
Varianceflow: High-Quality and Controllable Text-to-Speech using Variance Information via Normalizing Flow	Feb 27, 2023	text-to-speech, Text to Speech	Unverified
VECL-TTS: Voice identity and Emotional style controllable Cross-Lingual Text-to-Speech	Jun 12, 2024	text-to-speech, Text to Speech	Unverified
Vers une annotation automatique de corpus audio pour la synthèse de parole (Towards Fully Automatic Annotation of Audio Books for Text-To-Speech (TTS) Synthesis) [in French]	Jun 1, 2012	Speech Synthesis, text-to-speech	Unverified
Vevo: Controllable Zero-Shot Voice Imitation with Self-Supervised Disentanglement	Feb 11, 2025	Disentanglement, text-to-speech	Unverified
ViDA-MAN: Visual Dialog with Digital Humans	Oct 26, 2021	speech-recognition, Speech Recognition	Unverified
Vietnamese Text-To-Speech Shared Task VLSP 2020: Remaining problems with state-of-the-art techniques	Dec 1, 2020	text-to-speech, Text to Speech	Unverified
VioLA: Unified Codec Language Models for Speech Recognition, Synthesis, and Translation	May 25, 2023	Decoder, Language Modeling	Unverified
Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech	Oct 27, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Visatronic: A Multimodal Decoder-Only Model for Speech Synthesis	Nov 26, 2024	Decoder, multimodal generation	Unverified
Visual-Aware Text-to-Speech	Jun 21, 2023	Rhythm, Speech Synthesis	Unverified
VisualSpeech: Enhance Prosody with Visual Context in TTS	Jan 31, 2025	Prosody Prediction, text-to-speech	Unverified
VisualTTS: TTS with Accurate Lip-Speech Synchronization for Automatic Voice Over	Oct 7, 2021	Speech Synthesis, text-to-speech	Unverified
ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer	May 22, 2023	Decoder, Denoising	Unverified
Vocal effort modeling in neural TTS for improving the intelligibility of synthetic speech in noise	Mar 20, 2022	text-to-speech, Text to Speech	Unverified
VocalEyes: Enhancing Environmental Perception for the Visually Impaired through Vision-Language Models and Distance-Aware Object Detection	Mar 10, 2025	NVIDIA Jetson Orin Nano, object-detection	Unverified
Voice-Assisted Real-Time Traffic Sign Recognition System Using Convolutional Neural Network	Apr 11, 2024	Autonomous Vehicles, text-to-speech	Unverified
Voice Builder: A Tool for Building Text-To-Speech Voices	May 1, 2018	text-to-speech, Text to Speech	Unverified
Voice Cloning: a Multi-Speaker Text-to-Speech Synthesis Approach based on Transfer Learning	Feb 10, 2021	Speech Synthesis, text-to-speech	Unverified
Voice Conversion by Cascading Automatic Speech Recognition and Text-to-Speech Synthesis with Prosody Transfer	Sep 3, 2020	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Voice Filter: Few-shot text-to-speech speaker adaptation using voice conversion as a post-processing module	Feb 16, 2022	Speech Synthesis, text-to-speech	Unverified
Voice Imitating Text-to-Speech Neural Networks	Jun 4, 2018	Sentence, text-to-speech	Unverified
VoiceLDM: Text-to-Speech with Environmental Context	Sep 24, 2023	AudioCaps, text-to-speech	Unverified
VoiceWukong: Benchmarking Deepfake Voice Detection	Sep 10, 2024	Benchmarking, Face Swapping	Unverified
Voicing Personas: Rewriting Persona Descriptions into Style Prompts for Controllable Text-to-Speech	May 21, 2025	text-to-speech, Text to Speech	Unverified
VoxHakka: A Dialectally Diverse Multi-speaker Text-to-Speech System for Taiwanese Hakka	Sep 3, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing	Aug 11, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
VQTTS: High-Fidelity Text-to-Speech Synthesis with Self-Supervised VQ Acoustic Feature	Apr 2, 2022	Speech Synthesis, text-to-speech	Unverified
VR-GPT: Visual Language Model for Intelligent Virtual Reality Applications	May 19, 2024	Language Modeling, Language Modelling	Unverified
Vulnerability of Automatic Identity Recognition to Audio-Visual Deepfakes	Nov 29, 2023	Face Recognition, Face Swapping	Unverified
Wasserstein GAN and Waveform Loss-based Acoustic Model Training for Multi-speaker Text-to-Speech Synthesis Systems Using a WaveNet Vocoder	Jul 31, 2018	Generative Adversarial Network, Speech Synthesis	Unverified
Waveform generation for text-to-speech synthesis using pitch-synchronous multi-scale generative adversarial networks	Oct 30, 2018	Image Generation, Speech Synthesis	Unverified
WaveTTS: Tacotron-based TTS with Joint Time-Frequency Domain Loss	Feb 2, 2020	text-to-speech, Text to Speech	Unverified
Wave-U-Net Discriminator: Fast and Lightweight Discriminator for Generative Adversarial Network-Based Speech Synthesis	Mar 24, 2023	Generative Adversarial Network, Speech Synthesis	Unverified
WavThruVec: Latent speech representation as intermediate features for neural speech synthesis	Mar 31, 2022	Speech Synthesis, text-to-speech	Unverified
WCTC-Biasing: Retraining-free Contextual Biasing ASR with Wildcard CTC-based Keyword Spotting and Inter-layer Biasing	Jun 2, 2025	Keyword Spotting, speech-recognition	Unverified
Weakly-supervised text-to-speech alignment confidence measure	Dec 1, 2016	speech-recognition, Speech Recognition	Unverified
Werewolf: A Straightforward Game Framework with TTS for Improved User Engagement	May 30, 2025	text-to-speech, Text to Speech	Unverified
What happens to diffusion model likelihood when your model is conditional?	Sep 10, 2024	domain classification, model	Unverified
What the Future Brings: Investigating the Impact of Lookahead for Incremental Neural TTS	Sep 4, 2020	Decoder, Sentence	Unverified
What You Read Isn't What You Hear: Linguistic Sensitivity in Deepfake Speech Detection	May 23, 2025	Face Swapping, Sensitivity	Unverified
Whispered and Lombard Neural Speech Synthesis	Jan 13, 2021	Speaker Verification, Speech Synthesis	Unverified
Why Do Speech Language Models Fail to Generate Semantically Coherent Outputs? A Modality Evolving Perspective	Dec 22, 2024	text-to-speech, Text to Speech	Unverified

