SOTAVerified

Text to Speech

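from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selected for Kurdish).
    # Note: this assumes the installed gTTS version supports "ku";
    # gTTS raises ValueError for an unsupported language code.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    # Open the audio file (Windows only: "start" is a cmd.exe built-in).
    os.system(f"start {output_file}")

# Example (the text means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")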
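The snippet above opens the saved file with the Windows-only `start` command. A minimal sketch of a portable alternative follows, using only the standard library. The helper names `opener_command` and `open_audio_file` are hypothetical and not part of gTTS, and the `xdg-open` branch assumes a Linux or BSD desktop with xdg-utils installed.

```python
import os
import subprocess
import sys


def opener_command(path, platform=sys.platform):
    """Return the argv list that opens `path` in the default app.

    Returns None on Windows, where os.startfile is used instead of
    an external opener program.
    """
    if platform.startswith("win"):
        return None
    if platform == "darwin":
        return ["open", path]  # macOS
    return ["xdg-open", path]  # assumption: xdg-utils is available


def open_audio_file(path):
    """Open `path` with the platform's default application."""
    cmd = opener_command(path)
    if cmd is None:
        os.startfile(path)  # exists only on Windows
    else:
        subprocess.run(cmd, check=False)
```

Calling `open_audio_file(output_file)` in place of the `os.system(...)` line would keep the rest of the function unchanged. Passing the path as a list element rather than interpolating it into a shell string also avoids problems with spaces in file names.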

Papers

Showing 1051–1100 of 1419 papers

| Title | Status | Hype |
|---|---|---|
| Unsupervised TTS Acoustic Modeling for TTS with Conditional Disentangled Sequential VAE | | 0 |
| UzbekTagger: The rule-based POS tagger for Uzbek language | | 0 |
| VAKTA-SETU: A Speech-to-Speech Machine Translation Service in Select Indic Languages | | 0 |
| VALL-E 2: Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers | | 0 |
| VALL-E R: Robust and Efficient Zero-Shot Text-to-Speech Synthesis via Monotonic Alignment | | 0 |
| VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech | | 0 |
| VARA-TTS: Non-Autoregressive Text-to-Speech Synthesis based on Very Deep VAE with Residual Attention | | 0 |
| 可變速中文文字轉語音系統 (Variable Speech Rate Mandarin Chinese Text-to-Speech System) [In Chinese] | | 0 |
| Varianceflow: High-Quality and Controllable Text-to-Speech using Variance Information via Normalizing Flow | | 0 |
| VECL-TTS: Voice identity and Emotional style controllable Cross-Lingual Text-to-Speech | | 0 |
| Vers une annotation automatique de corpus audio pour la synthèse de parole (Towards Fully Automatic Annotation of Audio Books for Text-To-Speech (TTS) Synthesis) [in French] | | 0 |
| Vevo: Controllable Zero-Shot Voice Imitation with Self-Supervised Disentanglement | | 0 |
| ViDA-MAN: Visual Dialog with Digital Humans | | 0 |
| Vietnamese Text-To-Speech Shared Task VLSP 2020: Remaining problems with state-of-the-art techniques | | 0 |
| VioLA: Unified Codec Language Models for Speech Recognition, Synthesis, and Translation | | 0 |
| Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech | | 0 |
| Visatronic: A Multimodal Decoder-Only Model for Speech Synthesis | | 0 |
| Visual-Aware Text-to-Speech | | 0 |
| VisualSpeech: Enhance Prosody with Visual Context in TTS | | 0 |
| VisualTTS: TTS with Accurate Lip-Speech Synchronization for Automatic Voice Over | | 0 |
| ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer | | 0 |
| Vocal effort modeling in neural TTS for improving the intelligibility of synthetic speech in noise | | 0 |
| VocalEyes: Enhancing Environmental Perception for the Visually Impaired through Vision-Language Models and Distance-Aware Object Detection | | 0 |
| Voice-Assisted Real-Time Traffic Sign Recognition System Using Convolutional Neural Network | | 0 |
| Voice Builder: A Tool for Building Text-To-Speech Voices | | 0 |
| Voice Cloning: a Multi-Speaker Text-to-Speech Synthesis Approach based on Transfer Learning | | 0 |
| Voice Conversion by Cascading Automatic Speech Recognition and Text-to-Speech Synthesis with Prosody Transfer | | 0 |
| Voice Filter: Few-shot text-to-speech speaker adaptation using voice conversion as a post-processing module | | 0 |
| Voice Imitating Text-to-Speech Neural Networks | | 0 |
| VoiceLDM: Text-to-Speech with Environmental Context | | 0 |
| VoiceWukong: Benchmarking Deepfake Voice Detection | | 0 |
| Voicing Personas: Rewriting Persona Descriptions into Style Prompts for Controllable Text-to-Speech | | 0 |
| VoxHakka: A Dialectally Diverse Multi-speaker Text-to-Speech System for Taiwanese Hakka | | 0 |
| VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing | | 0 |
| VQTTS: High-Fidelity Text-to-Speech Synthesis with Self-Supervised VQ Acoustic Feature | | 0 |
| VR-GPT: Visual Language Model for Intelligent Virtual Reality Applications | | 0 |
| Vulnerability of Automatic Identity Recognition to Audio-Visual Deepfakes | | 0 |
| Wasserstein GAN and Waveform Loss-based Acoustic Model Training for Multi-speaker Text-to-Speech Synthesis Systems Using a WaveNet Vocoder | | 0 |
| Waveform generation for text-to-speech synthesis using pitch-synchronous multi-scale generative adversarial networks | | 0 |
| WaveTTS: Tacotron-based TTS with Joint Time-Frequency Domain Loss | | 0 |
| Wave-U-Net Discriminator: Fast and Lightweight Discriminator for Generative Adversarial Network-Based Speech Synthesis | | 0 |
| WavThruVec: Latent speech representation as intermediate features for neural speech synthesis | | 0 |
| WCTC-Biasing: Retraining-free Contextual Biasing ASR with Wildcard CTC-based Keyword Spotting and Inter-layer Biasing | | 0 |
| Weakly-supervised text-to-speech alignment confidence measure | | 0 |
| Werewolf: A Straightforward Game Framework with TTS for Improved User Engagement | | 0 |
| What happens to diffusion model likelihood when your model is conditional? | | 0 |
| What the Future Brings: Investigating the Impact of Lookahead for Incremental Neural TTS | | 0 |
| What You Read Isn't What You Hear: Linguistic Sensitivity in Deepfake Speech Detection | | 0 |
| Whispered and Lombard Neural Speech Synthesis | | 0 |
| Why Do Speech Language Models Fail to Generate Semantically Coherent Outputs? A Modality Evolving Perspective | | 0 |

No leaderboard results yet.