SOTAVerified

Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selected for Kurdish).
    # Note: "ku" may not be among gTTS's supported languages; gtts.lang.tts_langs() lists them.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (Windows only)

# Example (the Kurdish text means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
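
The snippet above opens the saved file with the Windows `start` command, so playback fails on other systems. The following is a minimal sketch of a platform-aware alternative, not part of the original snippet: the helper names `build_open_command` and `open_audio` are hypothetical, and the fallback branch assumes a desktop where `xdg-open` is installed.

```python
import subprocess
import sys


def build_open_command(path, platform=sys.platform):
    """Return the argv list that opens `path` with the default application."""
    if platform.startswith("win"):
        # `start` is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    # Assumption: a Linux/BSD desktop that provides xdg-open.
    return ["xdg-open", path]


def open_audio(path):
    """Open `path` in the default player; raises if the launcher fails."""
    subprocess.run(build_open_command(path), check=True)
```

Passing the command as a list rather than an interpolated shell string also keeps file names with spaces intact. Calling `open_audio(output_file)` in place of the `os.system` line would be one way to use it.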

Papers

Showing 1001-1050 of 1419 papers

| Title | Status | Hype |
| --- | --- | --- |
| A Survey on Neural Speech Synthesis | Code | 1 |
| FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis | Code | 1 |
| GANSpeech: Adversarial Training for High-Fidelity Multi-Speaker Speech Synthesis | | 0 |
| Non-Autoregressive TTS with Explicit Duration Modelling for Low-Resource Highly Expressive Speech | | 0 |
| Non-native English lexicon creation for bilingual speech synthesis | | 0 |
| Advances in Speech Vocoding for Text-to-Speech with Continuous Parameters | | 0 |
| WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis | Code | 1 |
| EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model | | 0 |
| Improving the expressiveness of neural vocoding with non-affine Normalizing Flows | | 0 |
| ADEPT: A Dataset for Evaluating Prosody Transfer | | 0 |
| Ctrl-P: Temporal Control of Prosodic Variation for Speech Synthesis | | 0 |
| RyanSpeech: A Corpus for Conversational Text-to-Speech Synthesis | Code | 1 |
| UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation | Code | 3 |
| A learned conditional prior for the VAE acoustic space of a TTS system | | 0 |
| SynthASR: Unlocking Synthetic Data for Speech Recognition | | 0 |
| HUI-Audio-Corpus-German: A high quality TTS dataset | Code | 1 |
| Enhancing Speaking Styles in Conversational Text-to-Speech Synthesis with Graph-based Multi-modal Context Modeling | Code | 1 |
| Improving multi-speaker TTS prosody variance with a residual encoder and normalizing flows | | 0 |
| Speech BERT Embedding For Improving Prosody in Neural TTS | | 0 |
| Data Augmentation Methods for End-to-end Speech Recognition on Distant-Talk Scenarios | | 0 |
| Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation | Code | 1 |
| Reinforce-Aligner: Reinforcement Alignment Search for Robust End-to-End Text-to-Speech | | 0 |
| An objective evaluation of the effects of recording conditions and speaker characteristics in multi-speaker deep neural speech synthesis | | 0 |
| Speaker verification-derived loss and data augmentation for DNN-based multispeaker speech synthesis | | 0 |
| Dual Script E2E framework for Multilingual and Code-Switching ASR | | 0 |
| A Corpus of Neutral Voice Speech in Brazilian Portuguese | | 0 |
| Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech | Code | 1 |
| Wav2KWS: Transfer Learning from Speech Representations for Keyword Spotting | Code | 1 |
| Learning Robust Latent Representations for Controllable Speech Synthesis | | 0 |
| DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism | Code | 2 |
| Talrómur: A large Icelandic TTS corpus | | 0 |
| On Addressing Practical Challenges for RNN-Transducer | | 0 |
| Phrase break prediction with bidirectional encoder representations in Japanese text-to-speech synthesis | Code | 0 |
| Deep Learning Based Assessment of Synthetic Speech Naturalness | Code | 1 |
| AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data | Code | 1 |
| KazakhTTS: An Open-Source Kazakh Text-to-Speech Synthesis Dataset | Code | 1 |
| TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction | Code | 1 |
| Proteno: Text Normalization with Limited Data for Fast Deployment in Text to Speech Systems | Code | 1 |
| Non-autoregressive sequence-to-sequence voice conversion | | 0 |
| Enhancing Word-Level Semantic Representation via Dependency Structure for Expressive Text-to-Speech Synthesis | | 0 |
| Comparing the Benefit of Synthetic Training Data for Various Automatic Speech Recognition Architectures | | 0 |
| A Toolbox for Construction and Analysis of Speech Datasets | Code | 1 |
| Exploring Machine Speech Chain for Domain Adaptation and Few-Shot Speaker Adaptation | | 0 |
| Flavored Tacotron: Conditional Learning for Prosodic-linguistic Features | | 0 |
| Grapheme-to-Phoneme Transformer Model for Transfer Learning Dialects | | 0 |
| AI4D -- African Language Program | Code | 0 |
| Hi-Fi Multi-Speaker English TTS Dataset | | 0 |
| Reinforcement Learning for Emotional Text-to-Speech Synthesis with Improved Emotion Discriminability | | 0 |
| Diff-TTS: A Denoising Diffusion Model for Text-to-Speech | | 0 |
| SC-GlowTTS: an Efficient Zero-Shot Multi-Speaker Text-To-Speech Model | Code | 1 |
Page 21 of 29

No leaderboard results yet.