SOTAVerified

Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selects Kurdish).
    # Assumes gTTS supports "ku"; gTTS raises ValueError for unsupported language codes.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (Windows only)

# Example (the text means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
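The snippet above opens the saved audio with `start`, which only exists on Windows. As a rough sketch of a more portable alternative, the helper below picks an opener command per platform using only the standard library. The names `open_command` and `open_file` are hypothetical, and the fallback assumes a Linux-like desktop with `xdg-open` installed.

```python
import subprocess
import sys


def open_command(path, platform=sys.platform):
    """Return the argv list that opens `path` with the default application."""
    if platform.startswith("win"):
        # `start` is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    # Assumption: any other platform is a Linux-like desktop with xdg-open.
    return ["xdg-open", path]


def open_file(path):
    """Open `path` in the default application without building a shell string."""
    subprocess.run(open_command(path), check=False)
```

Calling `open_file("output.mp3")` in place of the `os.system(...)` line would keep the rest of the function unchanged. Passing an argument list also avoids quoting problems when the file name contains spaces.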

Papers

Showing 201-250 of 1419 papers

Title | Status | Hype
Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech | Code | 1
FMFCC-A: A Challenging Mandarin Dataset for Synthetic Speech Detection | Code | 1
Fine-grained style control in Transformer-based Text-to-speech Synthesis | Code | 1
Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech | Code | 1
Mixer-TTS: non-autoregressive, fast and compact text-to-speech model conditioned on language model embeddings | Code | 1
EdiTTS: Score-based Editing for Controllable Text-to-Speech | Code | 1
Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration | Code | 1
UR Channel-Robust Synthetic Speech Detection System for ASVspoof 2021 | Code | 1
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion | Code | 1
EditSpeech: A Text Based Speech Editing System Using Partial Inference and Bidirectional Fusion | Code | 1
FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis | Code | 1
A Survey on Neural Speech Synthesis | Code | 1
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis | Code | 1
RyanSpeech: A Corpus for Conversational Text-to-Speech Synthesis | Code | 1
Enhancing Speaking Styles in Conversational Text-to-Speech Synthesis with Graph-based Multi-modal Context Modeling | Code | 1
HUI-Audio-Corpus-German: A high quality TTS dataset | Code | 1
Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation | Code | 1
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech | Code | 1
Wav2KWS: Transfer Learning from Speech Representations for Keyword Spotting | Code | 1
Deep Learning Based Assessment of Synthetic Speech Naturalness | Code | 1
AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data | Code | 1
KazakhTTS: An Open-Source Kazakh Text-to-Speech Synthesis Dataset | Code | 1
TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction | Code | 1
Proteno: Text Normalization with Limited Data for Fast Deployment in Text to Speech Systems | Code | 1
A Toolbox for Construction and Analysis of Speech Datasets | Code | 1
SC-GlowTTS: an Efficient Zero-Shot Multi-Speaker Text-To-Speech Model | Code | 1
Limited Data Emotional Voice Conversion Leveraging Text-to-Speech: Two-stage Sequence-to-Sequence Training | Code | 1
AdaSpeech: Adaptive Text to Speech for Custom Voice | Code | 1
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search | Code | 1
Bidirectional Variational Inference for Non-Autoregressive Text-to-Speech | Code | 1
Unified Mandarin TTS Front-end Based on Distilled BERT Model | Code | 1
Semi-supervised URL Segmentation with Recurrent Neural Networks Pre-trained on Knowledge Graph Entities | Code | 1
Universal MelGAN: A Robust Neural Vocoder for High-Fidelity Waveform Generation in Multiple Domains | Code | 1
Wave-Tacotron: Spectrogram-free end-to-end text-to-speech synthesis | Code | 1
Semi-supervised URL Segmentation with Recurrent Neural Networks Pre-trained on Knowledge Graph Entities | Code | 1
StyleMelGAN: An Efficient High-Fidelity Adversarial Vocoder with Temporal Adaptive Normalization | Code | 1
IESTAC: English-Italian Parallel Corpus for End-to-End Speech-to-Text Machine Translation | Code | 1
Effective Deep Learning Models for Automatic Diacritization of Arabic Text | Code | 1
One-class learning towards generalized voice spoofing detection | Code | 1
Google Crowdsourced Speech Corpora and Related Open-Source Resources for Low-Resource Languages and Dialects: An Overview | Code | 1
Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling | Code | 1
Accent Estimation of Japanese Words from Their Surfaces and Romanizations for Building Large Vocabulary Accent Dictionaries | Code | 1
Enhancing Speech Intelligibility in Text-To-Speech Synthesis using Speaking Style Conversion | Code | 1
Attentron: Few-Shot Text-to-Speech Utilizing Attention-Based Variable-Length Embedding | Code | 1
Speaker Conditional WaveRNN: Towards Universal Neural Vocoder for Unseen Speaker and Recording Conditions | Code | 1
Pretraining Techniques for Sequence-to-Sequence Voice Conversion | Code | 1
Phonological Features for 0-shot Multilingual Speech Synthesis | Code | 1
One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech | Code | 1
FastPitch: Parallel Text-to-speech with Pitch Prediction | Code | 1
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech | Code | 1
Page 5 of 29

No leaderboard results yet.