SOTAVerified

Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selects Kurdish).
    # gTTS rejects unsupported codes, so "ku" must appear in gtts.lang.tts_langs().
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (Windows only)

# Example (the string reads "Hello, this is my voice in the Kurdish language."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
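The snippet above opens the saved file with the Windows `start` command, so it will not play the audio on other systems. A minimal sketch of a cross-platform alternative follows. The helper names `build_open_command` and `open_audio_file` are hypothetical and not part of gTTS, and the Linux branch assumes `xdg-open` is installed.

```python
import os
import subprocess
import sys


def build_open_command(path, platform_name=None):
    """Return the argv list that opens `path` with the default application."""
    platform_name = platform_name or sys.platform
    if platform_name.startswith("win"):
        # `start` is a cmd.exe builtin; the empty string is its window-title argument.
        return ["cmd", "/c", "start", "", path]
    if platform_name == "darwin":
        return ["open", path]
    # Assumption: xdg-open is available, as on most Linux desktops.
    return ["xdg-open", path]


def open_audio_file(path):
    """Open a saved audio file with the system default player (hypothetical helper)."""
    subprocess.run(build_open_command(os.path.abspath(path)), check=False)
```

Calling `open_audio_file(output_file)` in place of the `os.system(...)` line would be one way to use it. Passing an argument list instead of a shell string also avoids quoting problems when the file name contains spaces.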

Papers

Showing 1101–1150 of 1419 papers

Title | Status | Hype
Diff-TTS: A Denoising Diffusion Model for Text-to-Speech |  | 0
Hi-Fi Multi-Speaker English TTS Dataset |  | 0
Attention Forcing for Machine Translation | Code | 0
Fast DCTTS: Efficient Deep Convolutional Text-to-Speech |  | 0
Expressive Text-to-Speech using Style Tag |  | 0
Multi-rate attention architecture for fast streamable Text-to-speech spectrum modeling |  | 0
Continual Speaker Adaptation for Text-to-Speech Synthesis |  | 0
STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech |  | 0
GAN Vocoder: Multi-Resolution Discriminator Is All You Need |  | 0
Investigating on Incorporating Pretrained and Learnable Speaker Representations for Multi-Speaker Multi-Style Text-to-Speech |  | 0
A Neural Text-to-Speech Model Utilizing Broadcast Data Mixed with Background Music |  | 0
Model architectures to extrapolate emotional expressions in DNN-based text-to-speech |  | 0
Alternate Endings: Improving Prosody for Incremental Neural TTS with Predicted Future Text Input |  | 0
AudioVisual Speech Synthesis: A brief literature review |  | 0
VARA-TTS: Non-Autoregressive Text-to-Speech Synthesis based on Very Deep VAE with Residual Attention |  | 0
Voice Cloning: a Multi-Speaker Text-to-Speech Synthesis Approach based on Transfer Learning |  | 0
Towards Natural and Controllable Cross-Lingual Voice Conversion Based on Neural TTS Model and Phonetic Posteriorgram |  | 0
Triple M: A Practical Text-to-speech Synthesis System With Multi-guidance Attention And Multi-band Multi-time LPCNet |  | 0
Expressive Neural Voice Cloning |  | 0
EmoCat: Language-agnostic Emotional Voice Conversion |  | 0
Generating coherent spontaneous speech and gesture from text |  | 0
Whispered and Lombard Neural Speech Synthesis |  | 0
Joint Audio-Visual Deepfake Detection |  | 0
Detection of Lexical Stress Errors in Non-Native (L2) English with Data Augmentation and Attention |  | 0
Parallel WaveNet conditioned on VAE latent vectors |  | 0
Denoising Text to Speech with Frame-Level Noise Modeling |  | 0
Syntactic representation learning for neural network based TTS with syntactic parse tree traversal |  | 0
Using previous acoustic context to improve Text-to-Speech synthesis |  | 0
MLS: A Large-Scale Multilingual Dataset for Speech Research | Code | 0
Cross-Modal Generalization: Learning in Low Resource Modalities via Meta-Alignment | Code | 0
Text-to-speech for the hearing impaired |  | 0
GraphPB: Graphical Representations of Prosody Boundary in Speech Synthesis |  | 0
Vietnamese Text-To-Speech Shared Task VLSP 2020: Remaining problems with state-of-the-art techniques |  | 0
Improving prosodic phrasing of Vietnamese text-to-speech systems |  | 0
Development of Smartcall Vietnamese Text-to-Speech for VLSP 2020 |  | 0
Bootstrap an end-to-end ASR system by multilingual training, transfer learning, text-to-text mapping and synthetic audio |  | 0
FBWave: Efficient and Scalable Neural Vocoders for Streaming Text-To-Speech on the Edge |  | 0
Synth2Aug: Cross-domain speaker recognition with TTS synthesized speech |  | 0
Using Synthetic Audio to Improve The Recognition of Out-Of-Vocabulary Words in End-To-End ASR Systems |  | 0
Empirical Evaluation of Deep Learning Model Compression Techniques on the WaveNet Vocoder | Code | 0
Deep Shallow Fusion for RNN-T Personalization |  | 0
Hierarchical Prosody Modeling for Non-Autoregressive Speech Synthesis |  | 0
Using IPA-Based Tacotron for Data Efficient Cross-Lingual Speaker Adaptation and Pronunciation Enhancement |  | 0
Low-resource expressive text-to-speech using data augmentation |  | 0
Simultaneous Speech-to-Speech Translation System with Neural Incremental ASR, MT, and TTS |  | 0
Fine-grained Style Modeling, Transfer and Prediction in Text-to-Speech Synthesis via Phone-Level Content-Style Disentanglement |  | 0
Naturalization of Text by the Insertion of Pauses and Filler Words | Code | 0
Improving Prosody Modelling with Cross-Utterance BERT Embeddings for End-to-end Speech Synthesis |  | 0
Prosodic Representation Learning and Contextual Sampling for Neural Text-to-Speech |  | 0
Incremental Machine Speech Chain Towards Enabling Listening while Speaking in Real-time |  | 0
Page 23 of 29

No leaderboard results yet.