SOTAVerified

Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selected for Kurdish).
    # Note: gTTS raises ValueError for unsupported codes, so confirm that "ku"
    # appears in gtts.lang.tts_langs() before relying on it.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (on Windows)

# Example:
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
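
The snippet above opens the saved file with the Windows `start` command, so playback only works on Windows. A minimal sketch of a cross-platform alternative follows, using only the standard library. The helper names `build_open_command` and `open_audio_file` are hypothetical, and the code assumes macOS provides `open` and that other desktops provide `xdg-open`.

```python
import os
import subprocess
import sys


def build_open_command(path, platform=sys.platform):
    """Return the argv list that opens `path` with the default application.

    Returns None on Windows, where os.startfile is used instead of a command.
    """
    if platform.startswith("win"):
        return None
    if platform == "darwin":
        return ["open", path]
    # Assumption: any other platform is a Linux/BSD desktop with xdg-open.
    return ["xdg-open", path]


def open_audio_file(path, platform=sys.platform):
    """Open an audio file with the system's default player."""
    command = build_open_command(path, platform)
    if command is None:
        # Windows only; unlike a shell `start` call, this handles spaces in paths.
        os.startfile(path)
    else:
        # check=False: a missing player should not crash the caller.
        subprocess.run(command, check=False)
```

Calling `open_audio_file(output_file)` in place of the `os.system(...)` line would keep the rest of the function unchanged.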

Papers

Showing 651–700 of 1419 papers

| Title | Status | Hype |
| --- | --- | --- |
| Boosting Large Language Model for Speech Synthesis: An Empirical Study | | 0 |
| Normalization of Lithuanian Text Using Regular Expressions | | 0 |
| AE-Flow: AutoEncoder Normalizing Flow | | 0 |
| Creating New Voices using Normalizing Flows | | 0 |
| External Knowledge Augmented Polyphone Disambiguation Using Large Language Model | | 0 |
| A review-based study on different Text-to-Speech technologies | | 0 |
| MM-TTS: Multi-modal Prompt based Style Transfer for Expressive Text-to-Speech Synthesis | | 0 |
| An Experimental Study: Assessing the Combined Framework of WavLM and BEST-RQ for Text-to-Speech Synthesis | | 0 |
| Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis | | 0 |
| Rapid Speaker Adaptation in Low Resource Text to Speech Systems using Synthetic Data and Transfer learning | | 0 |
| Code-Mixed Text to Speech Synthesis under Low-Resource Constraints | | 0 |
| Vulnerability of Automatic Identity Recognition to Audio-Visual Deepfakes | | 0 |
| Guided Flows for Generative Modeling and Decision Making | | 0 |
| Data Center Audio/Video Intelligence on Device (DAVID) -- An Edge-AI Platform for Smart-Toys | | 0 |
| Utilizing Speech Emotion Recognition and Recommender Systems for Negative Emotion Handling in Therapy Chatbots | | 0 |
| A Study on Altering the Latent Space of Pretrained Text to Speech Models for Improved Expressiveness | | 0 |
| ChatAnything: Facetime Chat with LLM-Enhanced Personas | | 0 |
| Synthetic Speaking Children -- Why We Need Them and How to Make Them | | 0 |
| Character-Level Bangla Text-to-IPA Transcription Using Transformer Architecture with Sequence Alignment | | 0 |
| Transduce and Speak: Neural Transducer for Text-to-Speech with Semantic Token Prediction | | 0 |
| E3 TTS: Easy End-to-End Diffusion-based Text to Speech | | 0 |
| Expressive TTS Driven by Natural Language Prompts Using Few Human Annotations | | 0 |
| Style Description based Text-to-Speech with Conditional Prosodic Layer Normalization based Diffusion GAN | | 0 |
| Generative Pre-training for Speech with Flow Matching | | 0 |
| DPP-TTS: Diversifying prosodic features of speech via determinantal point processes | | 0 |
| An overview of text-to-speech systems and media applications | | 0 |
| Attentive Multi-Layer Perceptron for Non-autoregressive Generation | Code | 0 |
| On the Relevance of Phoneme Duration Variability of Synthesized Training Data for Automatic Speech Recognition | | 0 |
| Prosody Analysis of Audiobooks | Code | 0 |
| Neutral TTS Female Voice Corpus in Brazilian Portuguese | | 0 |
| Unified speech and gesture synthesis using flow matching | | 0 |
| Comparative Analysis of Transfer Learning in Deep Learning Text-to-Speech Models on a Few-Shot, Low-Resource, Customized Dataset | | 0 |
| Latent Filling: Latent Space Data Augmentation for Zero-shot Speech Synthesis | | 0 |
| The VoiceMOS Challenge 2023: Zero-shot Subjective Speech Quality Prediction for Multiple Domains | | 0 |
| Towards human-like spoken dialogue generation between AI agents from written dialogue | | 0 |
| Low-Resource Self-Supervised Learning with SSL-Enhanced TTS | | 0 |
| Synthetic Speech Detection Based on Temporal Consistency and Distribution of Speaker Features | | 0 |
| High-Fidelity Speech Synthesis with Minimal Supervision: All Using Diffusion Models | | 0 |
| Face-StyleSpeech: Enhancing Zero-shot Speech Synthesis from Face Images with Improved Face-to-Speech Mapping | | 0 |
| VoiceLDM: Text-to-Speech with Environmental Context | | 0 |
| DurIAN-E: Duration Informed Attention Network For Expressive Text-to-Speech Synthesis | | 0 |
| The Impact of Silence on Speech Anti-Spoofing | | 0 |
| Speak While You Think: Streaming Speech Synthesis During Text Generation | | 0 |
| Exploring Speech Enhancement for Low-resource Speech Synthesis | | 0 |
| Leveraging Speech PTM, Text LLM, and Emotional TTS for Speech Emotion Recognition | | 0 |
| Augmenting text for spoken language understanding with Large Language Models | | 0 |
| PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-to-Speech Using Natural Language Descriptions | | 0 |
| Cross-lingual Knowledge Distillation via Flow-based Voice Conversion for Robust Polyglot Text-To-Speech | | 0 |
| Direct Text to Speech Translation System using Acoustic Units | | 0 |
| Cross-Utterance Conditioned VAE for Speech Generation | | 0 |

No leaderboard results yet.