SOTAVerified

Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selects Kurdish)
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (on Windows)

# Example (the string means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
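The snippet above opens the saved file with the Windows start command, so it only plays the audio on Windows. The following is a minimal sketch of a more portable way to open the file, assuming the default system opener is acceptable. The names opener_command and open_audio are hypothetical helpers, not part of gTTS, and the non-Windows, non-macOS branch assumes a desktop with xdg-open installed.

```python
import subprocess
import sys


def opener_command(path, platform=sys.platform):
    """Return the argument list that opens `path` with the default application."""
    if platform.startswith("win"):
        # `start` is a cmd.exe built-in; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    # Assumption: a Linux/BSD desktop where xdg-open is available.
    return ["xdg-open", path]


def open_audio(path):
    """Open an audio file with the system's default player (hypothetical helper)."""
    # Passing a list avoids shell quoting problems with spaces in file names.
    subprocess.run(opener_command(path), check=False)
```

With this in place, the os.system call could be swapped for open_audio(output_file); whether that is preferable depends on where the script is expected to run.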

Papers

Showing 401–450 of 1419 papers

Title | Status | Hype
DSE-TTS: Dual Speaker Embedding for Cross-Lingual Text-to-Speech |  | 0
DTW-SiameseNet: Dynamic Time Warped Siamese Network for Mispronunciation Detection and Correction |  | 0
An Implementation of Back-Propagation Learning on GF11, a Large SIMD Parallel Computer |  | 0
Dual Script E2E framework for Multilingual and Code-Switching ASR |  | 0
DualSpeech: Enhancing Speaker-Fidelity and Text-Intelligibility Through Dual Classifier-Free Guidance |  | 0
BERT, can HE predict contrastive focus? Predicting and controlling prominence in neural TTS using a language model |  | 0
DubWise: Video-Guided Speech Duration Control in Multimodal LLM-based Text-to-Speech for Dubbing |  | 0
Duration-aware pause insertion using pre-trained language model for multi-speaker text-to-speech |  | 0
Voice Impression Control in Zero-Shot TTS |  | 0
Enhancing Crowdsourced Audio for Text-to-Speech Models |  | 0
DurIAN-E: Duration Informed Attention Network For Expressive Text-to-Speech Synthesis |  | 0
Dynamic Prosody Generation for Speech Synthesis using Linguistics-Driven Acoustic Embedding Selection |  | 0
Enhancing Speech-to-Speech Translation with Multiple TTS Targets |  | 0
Ensemble prosody prediction for expressive speech synthesis |  | 0
ERVQ: Enhanced Residual Vector Quantization with Intra-and-Inter-Codebook Optimization for Neural Audio Codecs |  | 0
A Virtual Simulation-Pilot Agent for Training of Air Traffic Controllers |  | 0
Direct Speech to Speech Translation: A Review |  | 0
An Exploration of ECAPA-TDNN and x-vector Speaker Representations in Zero-shot Multi-speaker TTS |  | 0
Digital Einstein Experience: Fast Text-to-Speech for Conversational AI |  | 0
Effective Decoder Masking for Transformer Based End-to-End Speech Recognition |  | 0
DiffVoice: Text-to-Speech with Latent Diffusion |  | 0
Effectiveness of text to speech pseudo labels for forced alignment and cross lingual pretrained models for low resource speech recognition |  | 0
BOFFIN TTS: Few-Shot Speaker Adaptation by Bayesian Optimization |  | 0
Effect of choice of probability distribution, randomness, and search methods for alignment modeling in sequence-to-sequence text-to-speech synthesis using hard alignment |  | 0
Efficient data selection employing Semantic Similarity-based Graph Structures for model training |  | 0
Efficient Generative Modeling with Residual Vector Quantization-Based Tokens |  | 0
Efficient Incremental Text-to-Speech on GPUs |  | 0
AutoStyle-TTS: Retrieval-Augmented Generation based Automatic Style Matching Text-to-Speech Synthesis |  | 0
Efficiently Trained Low-Resource Mongolian Text-to-Speech System Based On FullConv-TTS |  | 0
An Expert System for Automatic Reading of A Text Written in Standard Arabic |  | 0
DiffStyleTTS: Diffusion-based Hierarchical Prosody Modeling for Text-to-Speech with Diverse and Controllable Styles |  | 0
Efficient training strategies for natural sounding speech synthesis and speaker adaptation based on FastPitch |  | 0
ELAICHI: Enhancing Low-resource TTS by Addressing Infrequent and Low-frequency Character Bigrams |  | 0
ELLA-V: Stable Neural Codec Language Modeling with Alignment-guided Sequence Reordering |  | 0
Auto Spell Suggestion for High Quality Speech Synthesis in Hindi |  | 0
BreezyVoice: Adapting TTS for Taiwanese Mandarin with Enhanced Polyphone Disambiguation -- Challenges and Insights |  | 0
ADEPT: A Dataset for Evaluating Prosody Transfer |  | 0
EmoCat: Language-agnostic Emotional Voice Conversion |  | 0
End-to-end speech recognition modeling from de-identified data |  | 0
Emo-DPO: Controllable Emotional Speech Synthesis through Direct Preference Optimization |  | 0
Autoregressive Speech Synthesis without Vector Quantization |  | 0
BTS: Back TranScription for Speech-to-Text Post-Processor using Text-to-Speech-to-Text |  | 0
An Experimental Study: Assessing the Combined Framework of WavLM and BEST-RQ for Text-to-Speech Synthesis |  | 0
AutoTTS: End-to-End Text-to-Speech Synthesis through Differentiable Duration Modeling |  | 0
Autoregressive Speech Synthesis with Next-Distribution Prediction |  | 0
An Exhaustive Evaluation of TTS- and VC-based Data Augmentation for ASR |  | 0
DiEmo-TTS: Disentangled Emotion Representations via Self-Supervised Distillation for Cross-Speaker Emotion Transfer in Text-to-Speech |  | 0
Emotional Dimension Control in Language Model-Based Text-to-Speech: Spanning a Broad Spectrum of Human Emotions |  | 0
Autoregressive Diffusion Transformer for Text-to-Speech Synthesis |  | 0
Diacritization of Maghrebi Arabic Sub-Dialects |  | 0

No leaderboard results yet.