SOTAVerified

Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selected for Kurdish).
    # Note: gTTS raises ValueError if the code is not in its supported-language list.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (Windows only)

# Example (the Kurdish text means "Hello, this is my voice in the Kurdish language."):
# text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
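The snippet above opens the saved file with `start`, which only exists on Windows. A minimal sketch of a portable alternative follows, using only the standard library. The helper names `player_command` and `play` are hypothetical, and the Linux branch assumes `xdg-open` is installed; neither is part of gTTS.

```python
import subprocess
import sys


def player_command(path, platform=sys.platform):
    """Return the argv list that opens `path` with the default application.

    Hypothetical helper for illustration; not part of gTTS.
    """
    if platform.startswith("win"):
        # `start` is a cmd.exe builtin; the empty string is its window-title argument.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    # Assumption: a desktop Linux/BSD system with xdg-utils available.
    return ["xdg-open", path]


def play(path):
    """Open the audio file without building a shell command string."""
    subprocess.run(player_command(path), check=False)
```

Passing an argument list instead of an f-string also avoids problems with file names that contain spaces.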

Papers

Showing 1001-1050 of 1419 papers

Title | Status | Hype
Voice Cloning: a Multi-Speaker Text-to-Speech Synthesis Approach based on Transfer Learning | | 0
Voice Conversion by Cascading Automatic Speech Recognition and Text-to-Speech Synthesis with Prosody Transfer | | 0
Voice Filter: Few-shot text-to-speech speaker adaptation using voice conversion as a post-processing module | | 0
Voice Imitating Text-to-Speech Neural Networks | | 0
VoiceLDM: Text-to-Speech with Environmental Context | | 0
VoiceWukong: Benchmarking Deepfake Voice Detection | | 0
Voicing Personas: Rewriting Persona Descriptions into Style Prompts for Controllable Text-to-Speech | | 0
VoxHakka: A Dialectally Diverse Multi-speaker Text-to-Speech System for Taiwanese Hakka | | 0
VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing | | 0
VQTTS: High-Fidelity Text-to-Speech Synthesis with Self-Supervised VQ Acoustic Feature | | 0
VR-GPT: Visual Language Model for Intelligent Virtual Reality Applications | | 0
Vulnerability of Automatic Identity Recognition to Audio-Visual Deepfakes | | 0
Wasserstein GAN and Waveform Loss-based Acoustic Model Training for Multi-speaker Text-to-Speech Synthesis Systems Using a WaveNet Vocoder | | 0
Waveform generation for text-to-speech synthesis using pitch-synchronous multi-scale generative adversarial networks | | 0
WaveTTS: Tacotron-based TTS with Joint Time-Frequency Domain Loss | | 0
Wave-U-Net Discriminator: Fast and Lightweight Discriminator for Generative Adversarial Network-Based Speech Synthesis | | 0
WavThruVec: Latent speech representation as intermediate features for neural speech synthesis | | 0
WCTC-Biasing: Retraining-free Contextual Biasing ASR with Wildcard CTC-based Keyword Spotting and Inter-layer Biasing | | 0
Weakly-supervised text-to-speech alignment confidence measure | | 0
Werewolf: A Straightforward Game Framework with TTS for Improved User Engagement | | 0
What happens to diffusion model likelihood when your model is conditional? | | 0
What the Future Brings: Investigating the Impact of Lookahead for Incremental Neural TTS | | 0
What You Read Isn't What You Hear: Linguistic Sensitivity in Deepfake Speech Detection | | 0
Whispered and Lombard Neural Speech Synthesis | | 0
Why Do Speech Language Models Fail to Generate Semantically Coherent Outputs? A Modality Evolving Perspective | | 0
Word-wise intonation model for cross-language TTS systems | | 0
You Do Not Need More Data: Improving End-To-End Speech Recognition by Text-To-Speech Data Augmentation | | 0
Your voice is your voice: Supporting Self-expression through Speech Generation and LLMs in Augmented and Alternative Communication | | 0
Zero-shot Cross-lingual Voice Transfer for TTS | | 0
Zero-Shot Long-Form Voice Cloning with Dynamic Convolution Attention | | 0
Zero-Shot Streaming Text to Speech Synthesis with Transducer and Auto-Regressive Modeling | | 0
Zero-Shot Text-to-Speech as Golden Speech Generator: A Systematic Framework and its Applicability in Automatic Pronunciation Assessment | | 0
Zero Shot Text to Speech Augmentation for Automatic Speech Recognition on Low-Resource Accented Speech Corpora | | 0
Zero-Shot Text-to-Speech for Vietnamese | | 0
Zero-shot text-to-speech synthesis conditioned using self-supervised speech representation model | | 0
Zero-Shot vs. Few-Shot Multi-Speaker TTS Using Pre-trained Czech SpeechT5 Model | | 0
ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis with Diffusion and Style-based Models | | 0
Zipper: A Multi-Tower Decoder Architecture for Fusing Modalities | | 0
The Zero Resource Speech Challenge 2019: TTS without T | | 0
From Text to Sound: A Preliminary Study on Retrieving Sound Effects to Radio Stories | | 0
On the Problem of Text-To-Speech Model Selection for Synthetic Data Generation in Automatic Speech Recognition | | 0
Handling Numeric Expressions in Automatic Speech Recognition | | 0
Bailing-TTS: Chinese Dialectal Speech Synthesis Towards Human-like Spontaneous Representation | | 0
Enhancing Kurdish Text-to-Speech with Native Corpus Training: A High-Quality WaveGlow Vocoder Approach | | 0
UDDETTS: Unifying Discrete and Dimensional Emotions for Controllable Emotional Text-to-Speech | | 0
Audio Turing Test: Benchmarking the Human-likeness of Large Language Model-based Text-to-Speech Systems in Chinese | | 0
Voice Impression Control in Zero-Shot TTS | | 0
Scheduled Interleaved Speech-Text Training for Speech-to-Speech Translation with LLMs | | 0
AASIST3: KAN-Enhanced AASIST Speech Deepfake Detection using SSL Features and Additional Regularization for the ASVspoof 2024 Challenge | | 0
A Bengali HMM Based Speech Synthesis System | | 0
Page 21 of 29

No leaderboard results yet.