Text to Speech

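from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku").
    # Note: "ku" may not be among the languages gTTS supports; check
    # gtts.lang.tts_langs() first, since gTTS rejects unsupported codes.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (Windows only)

# Example (the Kurdish text means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")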
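
The snippet above opens the saved file with the Windows `start` command, so it would not work elsewhere. A minimal sketch of a cross-platform alternative follows, using only the standard library. The helper names `open_command` and `open_file` are hypothetical and not part of gTTS, and the sketch assumes `open` exists on macOS and `xdg-open` on Linux desktops.

```python
import subprocess
import sys


def open_command(path, platform=sys.platform):
    """Return the argument list that opens `path` with the default application."""
    if platform.startswith("win"):
        # `start` is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    # Assumption: a Linux desktop with xdg-utils installed.
    return ["xdg-open", path]


def open_file(path):
    """Open `path` with the platform's default application."""
    # Passing the path as its own list element avoids interpolating it
    # into a shell command string.
    subprocess.run(open_command(path), check=False)
```

Calling `open_file("output.mp3")` could then replace the `os.system(...)` line.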

Papers

Showing 551–600 of 1419 papers

Title	Date	Tasks	Status	Hype
SeamlessM4T: Massively Multilingual & Multimodal Machine Translation	Aug 22, 2023	Automatic Speech Recognition, Machine Translation	Code Available	2
Multi-GradSpeech: Towards Diffusion-based Multi-Speaker Text-to-speech Using Consistent Diffusion Models	Aug 21, 2023	text-to-speech, Text to Speech	Unverified	0
AffectEcho: Speaker Independent and Language-Agnostic Emotion and Affect Transfer for Speech Synthesis	Aug 16, 2023	Attribute, Speech Synthesis	Unverified	0
SpeechX: Neural Codec Language Model as a Versatile Speech Transformer	Aug 14, 2023	Language Modeling, Language Modelling	Unverified	0
Text-to-Video: a Two-stage Framework for Zero-shot Identity-agnostic Talking-head Generation	Aug 12, 2023	Talking Head Generation, text-to-speech	Code Available	0
AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining	Aug 10, 2023	Audio Generation, In-Context Learning	Code Available	4
Towards an AI to Win Ghana's National Science and Maths Quiz	Aug 8, 2023	Math, Question Answering	Code Available	1
Let's Give a Voice to Conversational Agents in Virtual Reality	Aug 4, 2023	Speech-to-Text, text-to-speech	Code Available	0
Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation	Aug 3, 2023	Decoder, Quantization	Code Available	1
SALTTS: Leveraging Self-Supervised Speech Representations for improved Text-to-Speech Synthesis	Aug 2, 2023	Decoder, Self-Supervised Learning	Unverified	0
Improving grapheme-to-phoneme conversion by learning pronunciations from speech recordings	Jul 31, 2023	Grapheme-to-Phoneme Conversion, speech-recognition	Unverified	0
VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design	Jul 31, 2023	Computational Efficiency, text-to-speech	Code Available	2
DiffProsody: Diffusion-based Latent Prosody Generation for Expressive Speech Synthesis with Prosody Conditional Adversarial Training	Jul 31, 2023	Denoising, Expressive Speech Synthesis	Code Available	1
Multilingual context-based pronunciation learning for Text-to-Speech	Jul 31, 2023	text-to-speech, Text to Speech	Unverified	0
Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech	Jul 31, 2023	Acoustic Modelling, Speech Synthesis	Unverified	0
Improving TTS for Shanghainese: Addressing Tone Sandhi via Word Segmentation	Jul 30, 2023	text-to-speech, Text to Speech	Code Available	1
METTS: Multilingual Emotional Text-to-Speech by Cross-speaker and Cross-lingual Emotion Transfer	Jul 29, 2023	Disentanglement, Diversity	Unverified	0
ÌròyìnSpeech: A multi-purpose Yorùbá Speech Corpus	Jul 29, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	1
Minimally-Supervised Speech Synthesis with Conditional Diffusion Model and Language Model: A Comparative Study of Semantic Coding	Jul 28, 2023	Language Modeling, Language Modelling	Unverified	0
SC VALL-E: Style-Controllable Zero-Shot Text to Speech Synthesizer	Jul 20, 2023	Expressive Speech Synthesis, Language Modelling	Code Available	1
SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs	Jul 18, 2023	Generative Adversarial Network, Language Modeling	Unverified	0
Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis	Jul 14, 2023	In-Context Learning, Language Modelling	Unverified	0
Controllable Emphasis with zero data for text-to-speech	Jul 13, 2023	Sentence, text-to-speech	Unverified	0
On the Use of Self-Supervised Speech Representations in Spontaneous Speech Synthesis	Jul 11, 2023	Prediction, Self-Supervised Learning	Unverified	0
Artificial Eye for the Blind	Jul 7, 2023	Object, object-detection	Unverified	0
Text + Sketch: Image Compression at Ultra Low Rates	Jul 4, 2023	Image Compression, Text to Speech	Code Available	1
ContextSpeech: Expressive and Efficient Text-to-Speech for Paragraph Reading	Jul 3, 2023	Form, Sentence	Unverified	0
High-Quality Automatic Voice Over with Accurate Alignment: Supervision through Self-Supervised Discrete Speech Units	Jun 29, 2023	Speech Synthesis, text-to-speech	Unverified	0
EmoSpeech: Guiding FastSpeech2 Towards Emotional Text to Speech	Jun 28, 2023	Emotion Recognition, Speech Synthesis	Code Available	1
GenerTTS: Pronunciation Disentanglement for Timbre and Style Generalization in Cross-Lingual Text-to-Speech	Jun 27, 2023	Disentanglement, Style Generalization	Unverified	0
DSE-TTS: Dual Speaker Embedding for Cross-Lingual Text-to-Speech	Jun 25, 2023	Speech Synthesis, text-to-speech	Unverified	0
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale	Jun 23, 2023	In-Context Learning, Speech Synthesis	Code Available	0
Visual-Aware Text-to-Speech	Jun 21, 2023	Rhythm, Speech Synthesis	Unverified	0
Expressive Machine Dubbing Through Phrase-level Cross-lingual Prosody Transfer	Jun 20, 2023	text-to-speech, Text to Speech	Unverified	0
Low-Resource Text-to-Speech Using Specific Data and Noise Augmentation	Jun 16, 2023	Data Augmentation, text-to-speech	Unverified	0
CML-TTS A Multilingual Dataset for Speech Synthesis in Low-Resource Languages	Jun 16, 2023	Speech Synthesis, text-to-speech	Unverified	0
Towards Building Voice-based Conversational Recommender Systems: Datasets, Potential Solutions, and Prospects	Jun 14, 2023	Recommendation Systems, text-to-speech	Code Available	1
Improving Code-Switching and Named Entity Recognition in ASR with Speech Editing based Data Augmentation	Jun 14, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models	Jun 13, 2023	Speech Synthesis, text-to-speech	Code Available	5
PauseSpeech: Natural Speech Synthesis via Pre-trained Language Model and Pause-based Prosody Modeling	Jun 13, 2023	Language Modeling, Language Modelling	Unverified	0
Learning Emotional Representations from Imbalanced Speech Data for Speech Emotion Recognition and Emotional Text-to-Speech	Jun 9, 2023	Emotion Recognition, Speech Emotion Recognition	Unverified	0
VIFS: An End-to-End Variational Inference for Foley Sound Synthesis	Jun 8, 2023	Speech Synthesis, text-to-speech	Code Available	0
Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis	Jun 6, 2023	Neural Rendering, text-to-speech	Unverified	0
Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias	Jun 6, 2023	Attribute, Inductive Bias	Unverified	0
Cross-Lingual Transfer Learning for Phrase Break Prediction with Multilingual Language Model	Jun 5, 2023	Cross-Lingual Transfer, Language Modeling	Unverified	0
Latent Optimal Paths by Gumbel Propagation for Variational Bayesian Dynamic Programming	Jun 5, 2023	Bayesian Inference, Singing Voice Synthesis	Code Available	0
Rhythm-controllable Attention with High Robustness for Long Sentence Speech Synthesis	Jun 5, 2023	Rhythm, Sentence	Unverified	0
Towards Robust FastSpeech 2 by Modelling Residual Multimodality	Jun 2, 2023	Decoder, text-to-speech	Unverified	0
The Effects of Input Type and Pronunciation Dictionary Usage in Transfer Learning for Low-Resource Text-to-Speech	Jun 1, 2023	Cross-Lingual Transfer, text-to-speech	Unverified	0
XPhoneBERT: A Pre-trained Multilingual Model for Phoneme Representations for Text-to-Speech	May 31, 2023	text-to-speech, Text to Speech	Code Available	5

