SOTAVerified

Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selected for Kurdish).
    # gTTS raises ValueError for unsupported codes; confirm "ku" is listed by gtts.lang.tts_langs().
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (on Windows)

# Example (the string means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
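The snippet above opens the saved file with the Windows-only `start` command. A minimal sketch of a more portable alternative follows, assuming the default opener is `open` on macOS and `xdg-open` on Linux desktops; the helper names `build_open_command` and `open_audio_file` are hypothetical and not part of gTTS.

```python
import platform
import subprocess


def build_open_command(path, system=None):
    """Return the argument list that opens `path` with the default application.

    `system` defaults to platform.system(); it is a parameter only so the
    selection logic can be checked without launching anything.
    """
    system = system or platform.system()
    if system == "Windows":
        # `start` is a cmd.exe builtin; the empty string fills its window-title slot.
        return ["cmd", "/c", "start", "", path]
    if system == "Darwin":
        return ["open", path]
    # Assumption: other platforms are Linux-like desktops that provide xdg-open.
    return ["xdg-open", path]


def open_audio_file(path):
    """Launch the platform's default player for `path` (hypothetical helper)."""
    subprocess.run(build_open_command(path), check=False)
```

Calling `open_audio_file(output_file)` in place of the `os.system` line would be one way to use it. Passing an argument list rather than a formatted shell string also avoids quoting problems when the file name contains spaces.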

Papers

Showing 201-250 of 1419 papers

Title | Status | Hype
Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech | Code | 1
AdaSpeech 2: Adaptive Text to Speech with Untranscribed Data | Code | 1
Audio Jailbreak: An Open Comprehensive Benchmark for Jailbreaking Large Audio-Language Models | Code | 1
PRESENT: Zero-Shot Text-to-Prosody Control | Code | 1
Pretraining Techniques for Sequence-to-Sequence Voice Conversion | Code | 1
BibleTTS: a large, high-fidelity, multilingual, and uniquely African speech corpus | Code | 1
Bidirectional Variational Inference for Non-Autoregressive Text-to-Speech | Code | 1
BiSinger: Bilingual Singing Voice Synthesis | Code | 1
IESTAC: English-Italian Parallel Corpus for End-to-End Speech-to-Text Machine Translation | Code | 1
Imaginary Voice: Face-styled Diffusion Model for Text-to-Speech | Code | 1
EmoSpeech: Guiding FastSpeech2 Towards Emotional Text to Speech | Code | 1
EMNS /Imz/ Corpus: An emotive single-speaker dataset for narrative storytelling in games, television and graphic novels | Code | 1
Evaluating Speech Synthesis by Training Recognizers on Synthetic Speech | Code | 1
ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech | Code | 1
Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention | Code | 1
JETS: Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech | Code | 1
Effective Deep Learning Models for Automatic Diacritization of Arabic Text | Code | 1
Attentron: Few-Shot Text-to-Speech Utilizing Attention-Based Variable-Length Embedding | Code | 1
Attentive Sequence-to-Sequence Learning for Diacritic Restoration of Yorùbá Language Text | Code | 1
KazakhTTS: An Open-Source Kazakh Text-to-Speech Synthesis Dataset | Code | 1
Bts-e: Audio deepfake detection using breathing-talking-silence encoder | Code | 1
KazEmoTTS: A Dataset for Kazakh Emotional Text-to-Speech Synthesis | Code | 1
EdiTTS: Score-based Editing for Controllable Text-to-Speech | Code | 1
ALIF: Low-Cost Adversarial Audio Attacks on Black-Box Speech Platforms using Linguistic Features | Code | 1
Phonological Features for 0-shot Multilingual Speech Synthesis | Code | 1
EfficientSpeech: An On-Device Text to Speech Model | Code | 1
ShiftySpeech: A Large-Scale Synthetic Speech Dataset with Distribution Shifts | Code | 1
Limited Data Emotional Voice Conversion Leveraging Text-to-Speech: Two-stage Sequence-to-Sequence Training | Code | 1
E2 TTS: Embarrassingly Easy Fully Non-Autoregressive Zero-Shot TTS | Code | 1
Attention model for articulatory features detection | Code | 1
ADAPTERMIX: Exploring the Efficacy of Mixture of Adapters for Low-Resource TTS Adaptation | Code | 1
EditSpeech: A Text Based Speech Editing System Using Partial Inference and Bidirectional Fusion | Code | 1
MathReader : Text-to-Speech for Mathematical Documents | Code | 1
TextrolSpeech: A Text Style Control Speech Corpus With Codec Language Text-to-Speech Models | Code | 1
Dreamento: an open-source dream engineering toolbox for sleep EEG wearables | Code | 1
Making More of Little Data: Improving Low-Resource Automatic Speech Recognition Using Data Augmentation | Code | 1
Diffusion-Based Mel-Spectrogram Enhancement for Personalized Speech Synthesis with Found Data | Code | 1
Parameter-Efficient Learning for Text-to-Speech Accent Adaptation | Code | 1
DiffProsody: Diffusion-based Latent Prosody Generation for Expressive Speech Synthesis with Prosody Conditional Adversarial Training | Code | 1
Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations | Code | 1
Dict-TTS: Learning to Pronounce with Prior Dictionary Knowledge for Text-to-Speech | Code | 1
Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech | Code | 1
Perception of prosodic variation for speech synthesis using an unsupervised discrete representation of F0 | Code | 1
Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling | Code | 1
One-class learning towards generalized voice spoofing detection | Code | 1
Mixer-TTS: non-autoregressive, fast and compact text-to-speech model conditioned on language model embeddings | Code | 1
A Survey on Neural Speech Synthesis | Code | 1
MnTTS: An Open-Source Mongolian Text-to-Speech Synthesis Dataset and Accompanied Baseline | Code | 1
Neural Text to Articulate Talk: Deep Text to Audiovisual Speech Synthesis achieving both Auditory and Photo-realism | Code | 1
One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech | Code | 1

No leaderboard results yet.