SOTAVerified

Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selects Kurdish).
    # gTTS raises ValueError if "ku" is not among its supported languages.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (Windows only)

# Example (the text means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
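The snippet above opens the saved file with the Windows-only `start` command through `os.system`. A minimal sketch of a more portable alternative follows, using only the standard library. The helper names `build_open_command` and `open_audio` are hypothetical, and the fallback assumes a desktop where `xdg-open` is installed.

```python
import subprocess
import sys


def build_open_command(path, platform=sys.platform):
    """Return the command that opens `path` with the default application."""
    if platform.startswith("win"):
        # `start` is a cmd.exe built-in; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    # Assumption: a Linux/BSD desktop with xdg-utils installed.
    return ["xdg-open", path]


def open_audio(path):
    """Open the audio file without interpolating the path into a shell string."""
    subprocess.run(build_open_command(path), check=False)
```

Passing the command as a list avoids building a shell string from the file name, so paths containing spaces are handled as a single argument.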

Papers

Showing 251–300 of 1419 papers

Title | Status | Hype
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search | Code | 1
ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech | Code | 1
EmoSpeech: Guiding FastSpeech2 Towards Emotional Text to Speech | Code | 1
EfficientSpeech: An On-Device Text to Speech Model | Code | 1
Effective Deep Learning Models for Automatic Diacritization of Arabic Text | Code | 1
E2 TTS: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS | Code | 1
EditSpeech: A Text Based Speech Editing System Using Partial Inference and Bidirectional Fusion | Code | 1
Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention | Code | 1
DiffProsody: Diffusion-based Latent Prosody Generation for Expressive Speech Synthesis with Prosody Conditional Adversarial Training | Code | 1
Diffusion-Based Mel-Spectrogram Enhancement for Personalized Speech Synthesis with Found Data | Code | 1
Dict-TTS: Learning to Pronounce with Prior Dictionary Knowledge for Text-to-Speech | Code | 1
ArTST: Arabic Text and Speech Transformer | Code | 1
Developing multilingual speech synthesis system for Ojibwe, Mi'kmaq, and Maliseet | Code | 1
Deep Learning Based Assessment of Synthetic Speech Naturalness | Code | 1
EdiTTS: Score-based Editing for Controllable Text-to-Speech | Code | 1
Dreamento: an open-source dream engineering toolbox for sleep EEG wearables | Code | 1
Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech | Code | 1
Google Crowdsourced Speech Corpora and Related Open-Source Resources for Low-Resource Languages and Dialects: An Overview | Code | 1
EMNS /Imz/ Corpus: An emotive single-speaker dataset for narrative storytelling in games, television and graphic novels | Code | 1
Making More of Little Data: Improving Low-Resource Automatic Speech Recognition Using Data Augmentation | Code | 1
QS-TTS: Towards Semi-Supervised Text-to-Speech Synthesis via Vector-Quantized Self-Supervised Speech Representation Learning | Code | 1
Clip-TTS: Contrastive Text-content and Mel-spectrogram, A High-Quality Text-to-Speech Method based on Contextual Semantic Understanding | | 0
ClArTTS: An Open-Source Classical Arabic Text-to-Speech Corpus | | 0
ArmanTTS single-speaker Persian dataset | | 0
CLaM-TTS: Improving Neural Codec Language Model for Zero-Shot Text-to-Speech | | 0
A Review of Multi-Modal Large Language and Vision Models | | 0
A Human-in-the-Loop Approach to Improving Cross-Text Prosody Transfer | | 0
CHULA TTS: A Modularized Text-To-Speech Framework | | 0
CHiVE: Varying Prosody in Speech Synthesis with a Linguistically Driven Dynamic Hierarchical Conditional Variational Network | | 0
A Review of Deep Learning Techniques for Speech Processing | | 0
ChatAnything: Facetime Chat with LLM-Enhanced Personas | | 0
Character-Level Bangla Text-to-IPA Transcription Using Transformer Architecture with Sequence Alignment | | 0
A review-based study on different Text-to-Speech technologies | | 0
A Generative Model of a Pronunciation Lexicon for Hindi | | 0
A Cost Efficient Approach to Correct OCR Errors in Large Document Collections | | 0
Characteristic-Specific Partial Fine-Tuning for Efficient Emotion and Speaker Adaptation in Codec Language Text-to-Speech Models | | 0
Chain-of-Thought Training for Open E2E Spoken Dialogue Systems | | 0
A Fully Time-domain Neural Model for Subband-based Speech Synthesizer | | 0
CASSANDRA: A multipurpose configurable voice-enabled human-computer-interface | | 0
Arabic Text-To-Speech (TTS) Data Preparation | | 0
A Bengali HMM Based Speech Synthesis System | | 0
CapSpeech: Enabling Downstream Applications in Style-Captioned Text-to-Speech | | 0
A Proposal of Automatic Error Correction in Text | | 0
Can we steal your vocal identity from the Internet?: Initial investigation of cloning Obama's voice using GAN, WaveNet and low-quality found data | | 0
Can we reconstruct a dysarthric voice with the large speech model Parler TTS? | | 0
A Preliminary Analysis of Automatic Word and Syllable Prominence Detection in Non-Native Speech With Text-to-Speech Prosody Embeddings | | 0
A Corpus of Neutral Voice Speech in Brazilian Portuguese | | 0
Listening while Speaking and Visualizing: Improving ASR through Multimodal Chain | | 0
CSSinger: End-to-End Chunkwise Streaming Singing Voice Synthesis System Based on Conditional Variational Autoencoder | | 0
Can We Achieve High-quality Direct Speech-to-Speech Translation without Parallel Speech Data? | | 0
Page 6 of 29

No leaderboard results yet.