SOTAVerified

Text to Speech

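from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selected for Kurdish).
    # Note: gTTS raises ValueError for unsupported codes; gtts.lang.tts_langs()
    # lists the supported ones, so verify that "ku" is among them.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (on Windows)

# Example (the string says: "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")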
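The snippet above opens the generated audio with `start`, which exists only on Windows. A minimal sketch of a cross-platform alternative follows; the helper names `build_open_command` and `open_file` are hypothetical, and the fallback assumes a Linux-like desktop where `xdg-open` is installed. Passing the command as a list also avoids shell quoting problems when the file name contains spaces.

```python
import platform
import subprocess


def build_open_command(path, system=None):
    """Return the argument list that opens `path` with the default application."""
    system = system or platform.system()
    if system == "Windows":
        # `start` is a cmd.exe builtin; the empty string is its window-title argument.
        return ["cmd", "/c", "start", "", path]
    if system == "Darwin":
        # macOS ships the `open` command.
        return ["open", path]
    # Assumption: any other platform is a Linux-like desktop with xdg-open available.
    return ["xdg-open", path]


def open_file(path):
    """Open `path` with the platform's default application (hypothetical helper)."""
    subprocess.run(build_open_command(path), check=False)
```

Inside `text_to_speech_kurdish`, the `os.system(...)` line could then be replaced with `open_file(output_file)`.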

Papers

Showing 376-400 of 1419 papers

| Title | Status | Hype |
| --- | --- | --- |
| XTTS: a Massively Multilingual Zero-Shot Text-to-Speech Model | Code | 1 |
| A Human-in-the-Loop Approach to Improving Cross-Text Prosody Transfer | | 0 |
| Small-E: Small Language Model with Linear Attention for Efficient Speech Synthesis | Code | 2 |
| Total-Duration-Aware Duration Modeling for Text-to-Speech Systems | | 0 |
| Improving Audio Codec-based Zero-Shot Text-to-Speech Synthesis with Multi-Modal Context and Large Language Model | | 0 |
| Harder or Different? Understanding Generalization of Audio Deepfake Detection | | 0 |
| Style Mixture of Experts for Expressive Text-To-Speech Synthesis | | 0 |
| Task Arithmetic can Mitigate Synthetic-to-Real Gap in Automatic Speech Recognition | | 0 |
| Discrete Multimodal Transformers with a Pretrained Large Language Model for Mixed-Supervision Speech Processing | | 0 |
| Seed-TTS: A Family of High-Quality Versatile Speech Generation Models | Code | 7 |
| BiVocoder: A Bidirectional Neural Vocoder Integrating Feature Extraction and Waveform Generation | | 0 |
| Phonetic Enhanced Language Modeling for Text-to-Speech Synthesis | | 0 |
| ControlSpeech: Towards Simultaneous and Independent Zero-shot Speaker Cloning and Zero-shot Language Style Control | Code | 3 |
| Accent Conversion in Text-To-Speech Using Multi-Level VAE and Adversarial Training | | 0 |
| Enhancing Zero-shot Text-to-Speech Synthesis with Human Feedback | | 0 |
| Zipper: A Multi-Tower Decoder Architecture for Fusing Modalities | | 0 |
| TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation | Code | 2 |
| Denoising LM: Pushing the Limits of Error Correction Models for Speech Recognition | | 0 |
| Multilingual Prosody Transfer: Comparing Supervised & Transfer Learning | | 0 |
| DLPO: Diffusion Model Loss-Guided Reinforcement Learning for Fine-Tuning Text-to-Speech Diffusion Models | | 0 |
| Multi-speaker Text-to-speech Training with Speaker Anonymized Data | | 0 |
| VR-GPT: Visual Language Model for Intelligent Virtual Reality Applications | | 0 |
| Exploring speech style spaces with language models: Emotional TTS without emotion labels | | 0 |
| Evaluating Text-to-Speech Synthesis from a Large Discrete Token-based Speech Language Model | | 0 |
| Building a Luganda Text-to-Speech Model From Crowdsourced Data | | 0 |
Page 16 of 57

No leaderboard results yet.