SOTAVerified

Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selected for Kurdish).
    # gTTS raises ValueError if it does not support the requested language code.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (on Windows)

# Example (the Kurdish text means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
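The snippet above opens the saved file with `start`, which only exists on Windows. A minimal sketch of a portable alternative follows, using only the standard library. The helper names `open_command` and `open_file` are hypothetical, and the Linux branch assumes a desktop where `xdg-open` is installed.

```python
import platform
import subprocess


def open_command(path, system=None):
    """Return the argv list that opens `path` with the default application."""
    system = system or platform.system()
    if system == "Windows":
        # `start` is a cmd.exe builtin; the empty string fills its window-title slot.
        return ["cmd", "/c", "start", "", path]
    if system == "Darwin":
        return ["open", path]
    # Assumption: a freedesktop-style Linux desktop with xdg-open available.
    return ["xdg-open", path]


def open_file(path):
    """Launch the default player for `path` without waiting for it to exit."""
    subprocess.Popen(open_command(path))
```

Passing the command as a list rather than an f-string also avoids shell quoting problems when the output file name contains spaces.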

Papers

Showing 1351-1375 of 1419 papers

Title | Status | Hype
Does Audio Deepfake Detection Generalize? | | 0
Do Prosody Transfer Models Transfer Prosody? | | 0
DPI-TTS: Directional Patch Interaction for Fast-Converging and Style Temporal Modeling in Text-to-Speech | | 0
DPP-TTS: Diversifying prosodic features of speech via determinantal point processes | | 0
DSE-TTS: Dual Speaker Embedding for Cross-Lingual Text-to-Speech | | 0
DTW-SiameseNet: Dynamic Time Warped Siamese Network for Mispronunciation Detection and Correction | | 0
Dual Audio-Centric Modality Coupling for Talking Head Generation | | 0
Dual Script E2E framework for Multilingual and Code-Switching ASR | | 0
DualSpeech: Enhancing Speaker-Fidelity and Text-Intelligibility Through Dual Classifier-Free Guidance | | 0
Dual Supervised Learning | | 0
DubWise: Video-Guided Speech Duration Control in Multimodal LLM-based Text-to-Speech for Dubbing | | 0
Duration-aware pause insertion using pre-trained language model for multi-speaker text-to-speech | | 0
DurIAN-E 2: Duration Informed Attention Network with Adaptive Variational Autoencoder and Adversarial Learning for Expressive Text-to-Speech Synthesis | | 0
DurIAN-E: Duration Informed Attention Network For Expressive Text-to-Speech Synthesis | | 0
Dynamic Prosody Generation for Speech Synthesis using Linguistics-Driven Acoustic Embedding Selection | | 0
E1 TTS: Simple and Fast Non-Autoregressive TTS | | 0
E3 TTS: Easy End-to-End Diffusion-based Text to Speech | | 0
Easy, Interpretable, Effective: openSMILE for voice deepfake detection | | 0
Effective Decoder Masking for Transformer Based End-to-End Speech Recognition | | 0
Effectiveness of text to speech pseudo labels for forced alignment and cross lingual pretrained models for low resource speech recognition | | 0
Effect of choice of probability distribution, randomness, and search methods for alignment modeling in sequence-to-sequence text-to-speech synthesis using hard alignment | | 0
Efficient data selection employing Semantic Similarity-based Graph Structures for model training | | 0
Efficient Generative Modeling with Residual Vector Quantization-Based Tokens | | 0
Efficient Incremental Text-to-Speech on GPUs | | 0
Efficiently Trained Low-Resource Mongolian Text-to-Speech System Based On FullConv-TTS | | 0
Page 55 of 57

No leaderboard results yet.