SOTAVerified

Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selected for Kurdish).
    # Note: this assumes "ku" is among the languages gTTS supports;
    # gtts.lang.tts_langs() lists the available codes.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (on Windows)

# Example (the text means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
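The snippet above opens the saved file with `os.system(f"start {output_file}")`, which only works on Windows. A minimal sketch of a cross-platform alternative follows; the helper names `playback_command` and `open_audio` are hypothetical, and the Linux branch assumes a desktop where `xdg-open` is installed.

```python
import subprocess
import sys


def playback_command(path, platform=None):
    """Return the argv list that opens `path` with the default application."""
    platform = platform or sys.platform
    if platform.startswith("win"):
        # `start` is a cmd.exe builtin, so it has to run through cmd /c;
        # the empty string is the window-title argument `start` expects.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    # Assumption: a freedesktop-style Linux desktop with xdg-open installed.
    return ["xdg-open", path]


def open_audio(path):
    """Open the audio file without building a shell command string."""
    subprocess.run(playback_command(path), check=False)
```

Passing the path as a separate list element rather than interpolating it into a shell string means file names containing spaces do not need manual quoting.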

Papers

Showing 651-700 of 1419 papers

Title | Status | Hype
HLTCOE JHU Submission to the Voice Privacy Challenge 2024 | | 0
Cross-Dialect Text-To-Speech in Pitch-Accent Language Incorporating Multi-Dialect Phoneme-Level BERT | | 0
HMM-based data augmentation for E2E systems for building conversational speech synthesis systems | | 0
Generative Data Augmentation Challenge: Zero-Shot Speech Synthesis for Personalized Speech Enhancement | | 0
Creating New Voices using Normalizing Flows | | 0
Attention-Constrained Inference for Robust Decoder-Only Text-to-Speech | | 0
Human-in-the-loop Speaker Adaptation for DNN-based Multi-speaker TTS | | 0
Huqariq: A Multilingual Speech Corpus of Native Languages of Peru for Speech Recognition | | 0
Generative Adversarial Network based Speaker Adaptation for High Fidelity WaveNet Vocoder | | 0
HybridNet: A Hybrid Neural Architecture to Speed-up Autoregressive Models | | 0
Creating New Language and Voice Components for the Updated MaryTTS Text-to-Speech Synthesis Platform | | 0
Generative adversarial network-based glottal waveform model for statistical parametric speech synthesis | | 0
Creating an African American-Sounding TTS: Guidelines, Technical Challenges, and Surprising Evaluations | | 0
Attempt Towards Stress Transfer in Speech-to-Speech Machine Translation | | 0
Impact of Frame Rates on Speech Tokenizer: A Case Study on Mandarin and English | | 0
Improve Cross-lingual Voice Cloning Using Low-quality Code-switched Data | | 0
A Melody-Unsupervision Model for Singing Voice Synthesis | | 0
Intelligibility of Text-to-Speech Systems for Mathematical Expressions | | 0
Improve few-shot voice cloning using multi-modal learning | | 0
Improving Accent Conversion with Reference Encoder and End-To-End Text-To-Speech | | 0
Generating Synthetic Audio Data for Attention-Based Speech Recognition Systems | | 0
Generating Rich Product Descriptions for Conversational E-commerce Systems | | 0
Improving Code-Switching and Named Entity Recognition in ASR with Speech Editing based Data Augmentation | | 0
Improving Contextual Recognition of Rare Words with an Alternate Spelling Prediction Model | | 0
Improving Cross-lingual Speech Synthesis with Triplet Training Scheme | | 0
Improving Deliberation by Text-Only and Semi-Supervised Training | | 0
Learning Speech Representation From Contrastive Token-Acoustic Pretraining | | 0
Improving Grapheme-to-Phoneme Conversion through In-Context Knowledge Retrieval with Large Language Models | | 0
Generating Narrated Lecture Videos from Slides with Synchronized Highlights | | 0
Improving Low Resource Code-switched ASR using Augmented Code-switched TTS | | 0
Improving LPCNet-based Text-to-Speech with Linear Prediction-structured Mixture Density Network | | 0
Improving Mandarin Prosodic Structure Prediction with Multi-level Contextual Information | | 0
Improving multi-speaker TTS prosody variance with a residual encoder and normalizing flows | | 0
Improving Noise Robustness of LLM-based Zero-shot TTS via Discrete Acoustic Token Denoising | | 0
Improving Performance of End-to-End ASR on Numeric Sequences | | 0
Improving prosodic phrasing of Vietnamese text-to-speech systems | | 0
Improving Prosody Modelling with Cross-Utterance BERT Embeddings for End-to-end Speech Synthesis | | 0
Improving Readability for Automatic Speech Recognition Transcription | | 0
Generating Multilingual Voices Using Speaker Space Translation Based on Bilingual Speaker Data | | 0
Improving Robustness of LLM-based Speech Synthesis by Learning Monotonic Alignment | | 0
Improving Speech-to-Speech Translation Through Unlabeled Text | | 0
Improving the expressiveness of neural vocoding with non-affine Normalizing Flows | | 0
Improving the quality of neural TTS using long-form content and multi-speaker multi-style modeling | | 0
A Transfer Learning End-to-End Arabic Text-To-Speech (TTS) Deep Architecture | | 0
Generating Multilingual Gender-Ambiguous Text-to-Speech Voices | | 0
Incremental Disentanglement for Environment-Aware Zero-Shot Text-to-Speech Synthesis | | 0
Incremental FastPitch: Chunk-based High Quality Text to Speech | | 0
Incremental Machine Speech Chain Towards Enabling Listening while Speaking in Real-time | | 0
From Start to Finish: Latency Reduction Strategies for Incremental Speech Synthesis in Simultaneous Speech-to-Speech Translation | | 0
Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior | | 0

No leaderboard results yet.