SOTAVerified

Text to Speech

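from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selects Kurdish).
    # Note: gTTS validates language codes by default, so this raises
    # ValueError if "ku" is not in its supported language list.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (on Windows)

# Example (the text means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")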
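The snippet above opens the saved file with the Windows `start` command, so it only works on Windows. A minimal sketch of a cross-platform alternative follows, using only the standard library. The helper names `build_open_command` and `open_file` are hypothetical, and the Linux branch assumes a desktop where `xdg-open` is installed.

```python
import subprocess
import sys


def build_open_command(path, platform=sys.platform):
    """Return the argument list that opens `path` with the default application."""
    if platform.startswith("win"):
        # `start` is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        # macOS ships the `open` command.
        return ["open", path]
    # Assumption: a freedesktop-style Linux desktop with xdg-open available.
    return ["xdg-open", path]


def open_file(path):
    """Open `path` by passing an argument list instead of a shell string."""
    subprocess.run(build_open_command(path), check=False)
```

Calling `open_file("output.mp3")` after `tts.save(...)` would then replace the `os.system` line. Passing an argument list also avoids problems with spaces in the file name.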

Papers

Showing 151-200 of 1419 papers

Title | Status | Hype
Improving TTS for Shanghainese: Addressing Tone Sandhi via Word Segmentation | Code | 1
ÌròyìnSpeech: A multi-purpose Yorùbá Speech Corpus | Code | 1
SC VALL-E: Style-Controllable Zero-Shot Text to Speech Synthesizer | Code | 1
Text + Sketch: Image Compression at Ultra Low Rates | Code | 1
EmoSpeech: Guiding FastSpeech2 Towards Emotional Text to Speech | Code | 1
Towards Building Voice-based Conversational Recommender Systems: Datasets, Potential Solutions, and Prospects | Code | 1
ADAPTERMIX: Exploring the Efficacy of Mixture of Adapters for Low-Resource TTS Adaptation | Code | 1
Stochastic Pitch Prediction Improves the Diversity and Naturalness of Speech in Glow-TTS | Code | 1
An Efficient Membership Inference Attack for the Diffusion Model by Proximal Initialization | Code | 1
Multilingual Text-to-Speech Synthesis for Turkic Languages Using Transliteration | Code | 1
EfficientSpeech: An On-Device Text to Speech Model | Code | 1
EMNS /Imz/ Corpus: An emotive single-speaker dataset for narrative storytelling in games, television and graphic novels | Code | 1
Diffusion-Based Mel-Spectrogram Enhancement for Personalized Speech Synthesis with Found Data | Code | 1
Parameter-Efficient Learning for Text-to-Speech Accent Adaptation | Code | 1
Making More of Little Data: Improving Low-Resource Automatic Speech Recognition Using Data Augmentation | Code | 1
Bts-e: Audio deepfake detection using breathing-talking-silence encoder | Code | 1
Unsupervised Pre-Training For Data-Efficient Text-to-Speech On Low Resource Languages | Code | 1
Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations | Code | 1
Evaluating Parameter-Efficient Transfer Learning Approaches on SURE Benchmark for Speech Understanding | Code | 1
Imaginary Voice: Face-styled Diffusion Model for Text-to-Speech | Code | 1
Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining | Code | 1
ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech | Code | 1
StyleTTS-VC: One-Shot Voice Conversion by Knowledge Transfer from Style-Based TTS Models | Code | 1
RWEN-TTS: Relation-aware Word Encoding Network for Natural Text-to-Speech Synthesis | Code | 1
MnTTS2: An Open-Source Multi-Speaker Mongolian Text-to-Speech Synthesis Dataset | Code | 1
BASPRO: a balanced script producer for speech corpus collection based on the genetic algorithm | Code | 1
Learning to Dub Movies via Hierarchical Prosody Models | Code | 1
SpeechLMScore: Evaluating speech generation using speech language model | Code | 1
OverFlow: Putting flows on top of neural transducers for better TTS | Code | 1
Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder | Code | 1
FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis | Code | 1
HiFi-WaveGAN: Generative Adversarial Network with Auxiliary Spectrogram-Phase Loss for High-Fidelity Singing Voice Generation | Code | 1
Towards Relation Extraction From Speech | Code | 1
Can we use Common Voice to train a Multi-Speaker TTS system? | Code | 1
MnTTS: An Open-Source Mongolian Text-to-Speech Synthesis Dataset and Accompanied Baseline | Code | 1
Visualising Model Training via Vowel Space for Text-To-Speech Systems | Code | 1
Dreamento: an open-source dream engineering toolbox for sleep EEG wearables | Code | 1
BibleTTS: a large, high-fidelity, multilingual, and uniquely African speech corpus | Code | 1
Building African Voices | Code | 1
Automatic Prosody Annotation with Pre-Trained Text-Speech Model | Code | 1
Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on Data-Driven Deep Learning | Code | 1
Dict-TTS: Learning to Pronounce with Prior Dictionary Knowledge for Text-to-Speech | Code | 1
Cross-Utterance Conditioned VAE for Non-Autoregressive Text-to-Speech | Code | 1
A Character-level Span-based Model for Mandarin Prosodic Structure Prediction | Code | 1
An End-to-end Chinese Text Normalization Model based on Rule-guided Flat-Lattice Transformer | Code | 1
JETS: Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech | Code | 1
End to End Lip Synchronization with a Temporal AutoEncoder | Code | 1
Unsupervised Text-to-Speech Synthesis by Unsupervised Automatic Speech Recognition | Code | 1
ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet | Code | 1
More than Words: In-the-Wild Visually-Driven Prosody for Text-to-Speech | Code | 1
Page 4 of 29

No leaderboard results yet.