SOTAVerified

Text to Speech

import os
from gtts import gTTS

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selects Kurdish).
    # Caveat: gTTS may not support "ku"; gtts.lang.tts_langs() lists the supported codes.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (on Windows)

# Example (the string means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
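The snippet above opens the saved audio file with the Windows-only `start` command. A minimal sketch of a more portable approach follows, using only the standard library. The helper names `player_command` and `play_file` are hypothetical, and the fallback to `xdg-open` assumes a Linux-like desktop.

```python
import platform
import subprocess


def player_command(path, system=None):
    """Return the argument list that opens `path` with the default application."""
    system = system or platform.system()
    if system == "Windows":
        # `start` is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if system == "Darwin":
        return ["open", path]
    # Assumption: any other system is a Linux-like desktop that provides xdg-open.
    return ["xdg-open", path]


def play_file(path):
    """Open the file with the platform's default handler, without a shell string."""
    subprocess.run(player_command(path), check=False)
```

Calling `play_file("output.mp3")` in place of the `os.system(...)` line would hand the file to the default player on the current platform. Passing an argument list also avoids problems with spaces in file names.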

Papers

Showing 276–300 of 1419 papers

Title | Status | Hype
Cross-Dialect Text-To-Speech in Pitch-Accent Language Incorporating Multi-Dialect Phoneme-Level BERT | - | 0
SSR-Speech: Towards Stable, Safe and Robust Zero-shot Text-based Speech Editing and Synthesis | Code | 2
D-CAPTCHA++: A Study of Resilience of Deepfake CAPTCHA under Transferable Imperceptible Adversarial Attack | - | 0
Zero-Shot Text-to-Speech as Golden Speech Generator: A Systematic Framework and its Applicability in Automatic Pronunciation Assessment | - | 0
Enhancing Kurdish Text-to-Speech with Native Corpus Training: A High-Quality WaveGlow Vocoder Approach | - | 0
VoiceWukong: Benchmarking Deepfake Voice Detection | - | 0
What happens to diffusion model likelihood when your model is conditional? | - | 0
AS-Speech: Adaptive Style For Speech Synthesis | - | 0
IndicVoices-R: Unlocking a Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS | Code | 2
LAST: Language Model Aware Speech Tokenization | - | 0
Training Universal Vocoders with Feature Smoothing-Based Augmentation Methods for High-Quality TTS Systems | - | 0
VoxHakka: A Dialectally Diverse Multi-speaker Text-to-Speech System for Taiwanese Hakka | - | 0
A Framework for Synthetic Audio Conversations Generation using Large Language Models | - | 0
A multilingual training strategy for low resource Text to Speech | - | 0
Sample-Efficient Diffusion for Text-To-Speech Synthesis | Code | 2
MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer | Code | 9
SelectTTS: Synthesizing Anyone's Voice via Discrete Unit-Based Frame Selection | - | 0
AASIST3: KAN-Enhanced AASIST Speech Deepfake Detection using SSL Features and Additional Regularization for the ASVspoof 2024 Challenge | - | 0
Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model | Code | 3
Multi-modal Adversarial Training for Zero-Shot Voice Cloning | - | 0
Easy, Interpretable, Effective: openSMILE for voice deepfake detection | - | 0
StyleSpeech: Parameter-efficient Fine Tuning for Pre-trained Controllable Text-to-Speech | Code | 0
DualSpeech: Enhancing Speaker-Fidelity and Text-Intelligibility Through Dual Classifier-Free Guidance | - | 0
SimpleSpeech 2: Towards Simple and Efficient Text-to-Speech with Flow-based Scalar Latent Transformer Diffusion Models | - | 0
Positional Description for Numerical Normalization | - | 0
Page 12 of 57

No leaderboard results yet.