Text to Speech

from gtts import gTTS
import os


def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish ("ku" is chosen as the language code).
    # Note: gTTS may reject "ku" if it is not among its supported languages.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (Windows only)


# Example (the string means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
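The snippet above opens the saved file with `start`, which exists only on Windows. A minimal sketch of a cross-platform alternative follows, using only the standard library. The helper names `build_open_command` and `open_audio` are hypothetical, and the Linux branch assumes a desktop system with `xdg-open` installed.

```python
import platform
import subprocess


def build_open_command(path, system=None):
    """Return the argument list that opens `path` in the default application.

    `system` defaults to platform.system(); it is a parameter only so the
    choice of command can be checked without running on each OS.
    """
    system = system or platform.system()
    if system == "Windows":
        # `start` is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if system == "Darwin":
        # macOS ships the `open` command.
        return ["open", path]
    # Assumption: a Linux/BSD desktop where xdg-utils provides `xdg-open`.
    return ["xdg-open", path]


def open_audio(path):
    """Open the audio file with the platform's default player."""
    # check=False: a missing player should not crash the calling script.
    subprocess.run(build_open_command(path), check=False)
```

With this in place, the `os.system(...)` line could be swapped for `open_audio(output_file)`. Passing an argument list instead of a shell string also avoids quoting problems when the file name contains spaces.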

Papers

Showing 151–200 of 1419 papers

Title	Date	Tasks	Status	Hype
Improving TTS for Shanghainese: Addressing Tone Sandhi via Word Segmentation	Jul 30, 2023	text-to-speech, Text to Speech	Code Available	1
ÌròyìnSpeech: A multi-purpose Yorùbá Speech Corpus	Jul 29, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	1
SC VALL-E: Style-Controllable Zero-Shot Text to Speech Synthesizer	Jul 20, 2023	Expressive Speech Synthesis, Language Modelling	Code Available	1
Text + Sketch: Image Compression at Ultra Low Rates	Jul 4, 2023	Image Compression, Text to Speech	Code Available	1
EmoSpeech: Guiding FastSpeech2 Towards Emotional Text to Speech	Jun 28, 2023	Emotion Recognition, Speech Synthesis	Code Available	1
Towards Building Voice-based Conversational Recommender Systems: Datasets, Potential Solutions, and Prospects	Jun 14, 2023	Recommendation Systems, text-to-speech	Code Available	1
ADAPTERMIX: Exploring the Efficacy of Mixture of Adapters for Low-Resource TTS Adaptation	May 29, 2023	Speech Synthesis, text-to-speech	Code Available	1
Stochastic Pitch Prediction Improves the Diversity and Naturalness of Speech in Glow-TTS	May 28, 2023	Diversity, text-to-speech	Code Available	1
An Efficient Membership Inference Attack for the Diffusion Model by Proximal Initialization	May 26, 2023	Audio Generation, Inference Attack	Code Available	1
Multilingual Text-to-Speech Synthesis for Turkic Languages Using Transliteration	May 25, 2023	Speech Synthesis, text-to-speech	Code Available	1
EfficientSpeech: An On-Device Text to Speech Model	May 23, 2023	CPU, model	Code Available	1
EMNS /Imz/ Corpus: An emotive single-speaker dataset for narrative storytelling in games, television and graphic novels	May 22, 2023	Expressive Speech Synthesis, Speech Synthesis	Code Available	1
Diffusion-Based Mel-Spectrogram Enhancement for Personalized Speech Synthesis with Found Data	May 18, 2023	Speech Enhancement, Speech Synthesis	Code Available	1
Parameter-Efficient Learning for Text-to-Speech Accent Adaptation	May 18, 2023	Decoder, text-to-speech	Code Available	1
Making More of Little Data: Improving Low-Resource Automatic Speech Recognition Using Data Augmentation	May 18, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	1
Bts-e: Audio deepfake detection using breathing-talking-silence encoder	May 5, 2023	Audio Deepfake Detection, DeepFake Detection	Code Available	1
Unsupervised Pre-Training For Data-Efficient Text-to-Speech On Low Resource Languages	Mar 28, 2023	Data Augmentation, text-to-speech	Code Available	1
Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations	Mar 3, 2023	Speech Denoising, Speech Enhancement	Code Available	1
Evaluating Parameter-Efficient Transfer Learning Approaches on SURE Benchmark for Speech Understanding	Mar 2, 2023	Speech Synthesis, text-to-speech	Code Available	1
Imaginary Voice: Face-styled Diffusion Model for Text-to-Speech	Feb 27, 2023	Speech Synthesis, text-to-speech	Code Available	1
Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining	Jan 30, 2023	Language Modeling, Language Modelling	Code Available	1
ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech	Dec 30, 2022	Denoising, text-to-speech	Code Available	1
StyleTTS-VC: One-Shot Voice Conversion by Knowledge Transfer from Style-Based TTS Models	Dec 29, 2022	Data Augmentation, text-to-speech	Code Available	1
RWEN-TTS: Relation-aware Word Encoding Network for Natural Text-to-Speech Synthesis	Dec 15, 2022	Relation, Speech Synthesis	Code Available	1
MnTTS2: An Open-Source Multi-Speaker Mongolian Text-to-Speech Synthesis Dataset	Dec 11, 2022	Speech Synthesis, text-to-speech	Code Available	1
BASPRO: a balanced script producer for speech corpus collection based on the genetic algorithm	Dec 11, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	1
Learning to Dub Movies via Hierarchical Prosody Models	Dec 8, 2022	text-to-speech, Text to Speech	Code Available	1
SpeechLMScore: Evaluating speech generation using speech language model	Dec 8, 2022	Language Modeling, Language Modelling	Code Available	1
OverFlow: Putting flows on top of neural transducers for better TTS	Nov 13, 2022	Normalising Flows, Speech Synthesis	Code Available	1
Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder	Nov 7, 2022	Speech Synthesis, text-to-speech	Code Available	1
FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis	Oct 27, 2022	Speech Synthesis, text-to-speech	Code Available	1
HiFi-WaveGAN: Generative Adversarial Network with Auxiliary Spectrogram-Phase Loss for High-Fidelity Singing Voice Generation	Oct 23, 2022	Generative Adversarial Network, Singing Voice Synthesis	Code Available	1
Towards Relation Extraction From Speech	Oct 17, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	1
Can we use Common Voice to train a Multi-Speaker TTS system?	Oct 12, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	1
MnTTS: An Open-Source Mongolian Text-to-Speech Synthesis Dataset and Accompanied Baseline	Sep 22, 2022	Speech Synthesis, text-to-speech	Code Available	1
Visualising Model Training via Vowel Space for Text-To-Speech Systems	Aug 21, 2022	Speech Synthesis, text-to-speech	Code Available	1
Dreamento: an open-source dream engineering toolbox for sleep EEG wearables	Jul 8, 2022	EEG, Electroencephalogram (EEG)	Code Available	1
BibleTTS: a large, high-fidelity, multilingual, and uniquely African speech corpus	Jul 7, 2022	text-to-speech, Text to Speech	Code Available	1
Building African Voices	Jul 1, 2022	Speech Synthesis, text-to-speech	Code Available	1
Automatic Prosody Annotation with Pre-Trained Text-Speech Model	Jun 16, 2022	Speech Synthesis, text-to-speech	Code Available	1
Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on Data-Driven Deep Learning	Jun 15, 2022	Attribute, Emotion Classification	Code Available	1
Dict-TTS: Learning to Pronounce with Prior Dictionary Knowledge for Text-to-Speech	Jun 5, 2022	Polyphone disambiguation, text-to-speech	Code Available	1
Cross-Utterance Conditioned VAE for Non-Autoregressive Text-to-Speech	May 9, 2022	Diversity, text-to-speech	Code Available	1
A Character-level Span-based Model for Mandarin Prosodic Structure Prediction	Mar 31, 2022	Sentence, text-to-speech	Code Available	1
An End-to-end Chinese Text Normalization Model based on Rule-guided Flat-Lattice Transformer	Mar 31, 2022	Text Normalization, text-to-speech	Code Available	1
JETS: Jointly Training FastSpeech2 and HiFi-GAN for End to End Text to Speech	Mar 31, 2022	text-to-speech, Text to Speech	Code Available	1
End to End Lip Synchronization with a Temporal AutoEncoder	Mar 30, 2022	text-to-speech, Text to Speech	Code Available	1
Unsupervised Text-to-Speech Synthesis by Unsupervised Automatic Speech Recognition	Mar 29, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	1
ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet	Nov 29, 2021	Spoken Language Understanding, text-to-speech	Code Available	1
More than Words: In-the-Wild Visually-Driven Prosody for Text-to-Speech	Nov 19, 2021	text-to-speech, Text to Speech	Code Available	1

