SOTAVerified

Text to Speech

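from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (the language code "ku" selects Kurdish).
    # Note: gTTS rejects language codes it does not support, so confirm that
    # "ku" appears in gtts.lang.tts_langs() for the installed version first.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (Windows only)

# Example (the string means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")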
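The snippet above opens the saved file with the Windows-only `start` command. A minimal sketch of a portable alternative follows, assuming macOS provides `open` and other desktops provide `xdg-open`. The helper names `open_command` and `play` are hypothetical and not part of gTTS.

```python
import platform
import subprocess


def open_command(path, system=None):
    """Return the argument list that opens `path` with the default application."""
    system = system or platform.system()
    if system == "Windows":
        # `start` is a cmd.exe built-in; the empty string fills its window-title slot.
        return ["cmd", "/c", "start", "", path]
    if system == "Darwin":
        return ["open", path]
    # Assumption: any other system has xdg-open installed (common on Linux desktops).
    return ["xdg-open", path]


def play(path):
    """Open the audio file using an argument list rather than a shell string."""
    subprocess.run(open_command(path), check=False)
```

Passing an argument list instead of an f-string to `os.system` also keeps file names that contain spaces intact. Inside `text_to_speech_kurdish`, `play(output_file)` could stand in for the `os.system` call.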

Papers

Showing 426–450 of 1419 papers

Title | Status | Hype
Aligner-Guided Training Paradigm: Advancing Text-to-Speech Models with Aligner Guided Duration | - | 0
EmoSpeech: A Corpus of Emotionally Rich and Contextually Detailed Speech Annotations | - | 0
DiffStyleTTS: Diffusion-based Hierarchical Prosody Modeling for Text-to-Speech with Diverse and Controllable Styles | - | 0
Text Is Not All You Need: Multimodal Prompting Helps LLMs Understand Humor | - | 0
SALMONN-omni: A Codec-free LLM for Full-duplex Speech Understanding and Generation | - | 0
Continual Learning in Machine Speech Chain Using Gradient Episodic Memory | - | 0
Visatronic: A Multimodal Decoder-Only Model for Speech Synthesis | - | 0
Hard-Synth: Synthesizing Diverse Hard Samples for ASR using Zero-Shot TTS and LLM | - | 0
A Context-Based Numerical Format Prediction for a Text-To-Speech System | - | 0
Leveraging Virtual Reality and AI Tutoring for Language Learning: A Case Study of a Virtual Campus Environment with OpenAI GPT Integration with Unity 3D | - | 0
Rethinking MUSHRA: Addressing Modern Challenges in Text-to-Speech Evaluation | - | 0
Improving Grapheme-to-Phoneme Conversion through In-Context Knowledge Retrieval with Large Language Models | - | 0
Debatts: Zero-Shot Debating Text-to-Speech Synthesis | - | 0
CUIfy the XR: An Open-Source Package to Embed LLM-powered Conversational Agents in XR | - | 0
Speech is More Than Words: Do Speech-to-Text Translation Systems Leverage Prosody? | - | 0
Robust and Unbounded Length Generalization in Autoregressive Transformer-Based Text-to-Speech | Code | 0
Fast and High-Quality Auto-Regressive Speech Synthesis via Speculative Decoding | - | 0
RDSinger: Reference-based Diffusion Network for Singing Voice Synthesis | - | 0
Asynchronous Tool Usage for Real-Time Agents | - | 0
Get Large Language Models Ready to Speak: A Late-fusion Approach for Speech Generation | - | 0
Making Social Platforms Accessible: Emotion-Aware Speech Generation with Integrated Text Analysis | - | 0
Evaluating and Improving Automatic Speech Recognition Systems for Korean Meteorological Experts | - | 0
ELAICHI: Enhancing Low-resource TTS by Addressing Infrequent and Low-frequency Character Bigrams | - | 0
Enhancing Low-Resource ASR through Versatile TTS: Bridging the Data Gap | - | 0
Continuous Speech Tokenizer in Text To Speech | Code | 0

No leaderboard results yet.