SOTAVerified

Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (selecting the language code "ku" for Kurdish).
    # Note: gTTS raises ValueError if the installed version does not support "ku".
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (Windows only)

# Example (the string means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
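The `start` command used above to open the audio file only exists on Windows. As a minimal sketch of one way to make that step portable, the helper below picks an opener per platform using only the standard library. The names `player_command` and `open_audio` are hypothetical and not part of the original snippet, and the Linux branch assumes `xdg-open` is installed.

```python
import subprocess
import sys


def player_command(path, platform=sys.platform):
    """Return an argv list that opens `path` with the platform's default app."""
    if platform.startswith("win"):
        # `start` is a cmd.exe builtin; the empty string fills its window-title slot.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    # Assumption: a desktop Linux/BSD system with xdg-open available.
    return ["xdg-open", path]


def open_audio(path):
    """Launch the default player for `path`; the exit status is not checked."""
    subprocess.run(player_command(path), check=False)
```

Calling `open_audio("output.mp3")` in place of the `os.system` line would keep the rest of the function unchanged. Passing an argument list rather than a formatted shell string also avoids quoting problems with file names that contain spaces.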

Papers

Showing 226-250 of 1419 papers

Title | Status | Hype
DMOSpeech: Direct Metric Optimization via Distilled Diffusion Model in Zero-Shot Speech Synthesis | - | 0
Emphasis Rendering for Conversational Text-to-Speech with Multi-modal Multi-scale Context Modeling | Code | 0
Unsupervised Data Validation Methods for Efficient Model Training | - | 0
Efficient training strategies for natural sounding speech synthesis and speaker adaptation based on FastPitch | - | 0
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching | Code | 11
Can DeepFake Speech be Reliably Detected? | - | 0
Bahasa Harmony: A Comprehensive Dataset for Bahasa Text-to-Speech Synthesis with Discrete Codec Modeling of EnGen-TTS | - | 0
SegINR: Segment-wise Implicit Neural Representation for Sequence Alignment in Neural Text-to-Speech | - | 0
HALL-E: Hierarchical Neural Codec Language Model for Minute-Long Zero-Shot Text-to-Speech Synthesis | - | 0
Where are we in audio deepfake detection? A systematic analysis over generative and detection models | Code | 1
Adversarial Attacks and Robust Defenses in Speaker Embedding based Zero-Shot Text-to-Speech System | - | 0
Textless Streaming Speech-to-Speech Translation using Semantic Speech Tokens | - | 0
Generative Semantic Communication for Text-to-Speech Synthesis | - | 0
MultiVerse: Efficient and Expressive Zero-Shot Multi-Task Text-to-Speech | - | 0
Recent Advances in Speech Language Models: A Survey | Code | 2
EmoKnob: Enhance Voice Cloning with Fine-Grained Emotion Control | Code | 2
Augmentation through Laundering Attacks for Audio Spoof Detection | - | 0
Accent conversion using discrete units with parallel data synthesized from controllable accented TTS | - | 0
Word-wise intonation model for cross-language TTS systems | - | 0
FluentEditor2: Text-based Speech Editing by Modeling Multi-Scale Acoustic and Prosody Consistency | Code | 0
Description-based Controllable Text-to-Speech with Cross-Lingual Voice Control | - | 0
Exploring synthetic data for cross-speaker style transfer in style representation based TTS | - | 0
Emotional Dimension Control in Language Model-Based Text-to-Speech: Spanning a Broad Spectrum of Human Emotions | - | 0
Enabling Auditory Large Language Models for Automatic Speech Quality Evaluation | Code | 5
StyleFusion TTS: Multimodal Style-control and Enhanced Feature Fusion for Zero-shot Text-to-speech Synthesis | - | 0

No leaderboard results yet.