Text to Speech

from gtts import gTTS
import os


def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku").
    # Note: gTTS raises ValueError if "ku" is not in its supported-language list.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (Windows only)


# Example (the text means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")

Papers


Showing 701–750 of 1419 papers

Title	Date	Tasks	Status
Large-Scale Automatic Audiobook Creation	Sep 7, 2023	text-to-speech, Text to Speech	Unverified
GRASS: Unified Generation Model for Speech-to-Semantic Tasks	Sep 6, 2023	named-entity-recognition, Named Entity Recognition	Unverified
MuLanTTS: The Microsoft Speech Synthesis System for Blizzard Challenge 2023	Sep 6, 2023	Speech Synthesis, text-to-speech	Unverified
PromptTTS 2: Describing and Generating Voices with Text Prompt	Sep 5, 2023	Language Modelling, Large Language Model	Unverified
A Comparative Analysis of Pretrained Language Models for Text-to-Speech	Sep 4, 2023	Natural Language Understanding, Prediction	Unverified
The FruitShell French synthesis system at the Blizzard 2023 Challenge	Sep 1, 2023	Data Augmentation, Speech Synthesis	Unverified
Learning Speech Representation From Contrastive Token-Acoustic Pretraining	Sep 1, 2023	Audio Classification, Automatic Speech Recognition	Unverified
Improving Mandarin Prosodic Structure Prediction with Multi-level Contextual Information	Aug 31, 2023	Decoder, Multi-Task Learning	Unverified
Towards Spontaneous Style Modeling with Semi-supervised Pre-training for Conversational Text-to-Speech Synthesis	Aug 31, 2023	Expressive Speech Synthesis, Sentence	Unverified
The DeepZen Speech Synthesis System for Blizzard Challenge 2023	Aug 30, 2023	Sentence, Speech Synthesis	Unverified
Pruning Self-Attention for Zero-Shot Multi-Speaker Text-to-Speech	Aug 28, 2023	Domain Generalization, text-to-speech	Unverified
Rep2wav: Noise Robust text-to-speech Using self-supervised representations	Aug 28, 2023	Speech Enhancement, text-to-speech	Unverified
Generalizable Zero-Shot Speaker Adaptive Speech Synthesis with Disentangled Representations	Aug 24, 2023	Representation Learning, Speech Synthesis	Unverified
Multi-GradSpeech: Towards Diffusion-based Multi-Speaker Text-to-speech Using Consistent Diffusion Models	Aug 21, 2023	text-to-speech, Text to Speech	Unverified
AffectEcho: Speaker Independent and Language-Agnostic Emotion and Affect Transfer for Speech Synthesis	Aug 16, 2023	Attribute, Speech Synthesis	Unverified
SpeechX: Neural Codec Language Model as a Versatile Speech Transformer	Aug 14, 2023	Language Modeling, Language Modelling	Unverified
Text-to-Video: a Two-stage Framework for Zero-shot Identity-agnostic Talking-head Generation	Aug 12, 2023	Talking Head Generation, text-to-speech	Code Available
Let's Give a Voice to Conversational Agents in Virtual Reality	Aug 4, 2023	Speech-to-Text, text-to-speech	Code Available
SALTTS: Leveraging Self-Supervised Speech Representations for improved Text-to-Speech Synthesis	Aug 2, 2023	Decoder, Self-Supervised Learning	Unverified
Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech	Jul 31, 2023	Acoustic Modelling, Speech Synthesis	Unverified
Improving grapheme-to-phoneme conversion by learning pronunciations from speech recordings	Jul 31, 2023	Grapheme-to-Phoneme Conversion, speech-recognition	Unverified
Multilingual context-based pronunciation learning for Text-to-Speech	Jul 31, 2023	text-to-speech, Text to Speech	Unverified
METTS: Multilingual Emotional Text-to-Speech by Cross-speaker and Cross-lingual Emotion Transfer	Jul 29, 2023	Disentanglement, Diversity	Unverified
Minimally-Supervised Speech Synthesis with Conditional Diffusion Model and Language Model: A Comparative Study of Semantic Coding	Jul 28, 2023	Language Modeling, Language Modelling	Unverified
SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs	Jul 18, 2023	Generative Adversarial Network, Language Modeling	Unverified
Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis	Jul 14, 2023	In-Context Learning, Language Modelling	Unverified
Controllable Emphasis with zero data for text-to-speech	Jul 13, 2023	Sentence, text-to-speech	Unverified
On the Use of Self-Supervised Speech Representations in Spontaneous Speech Synthesis	Jul 11, 2023	Prediction, Self-Supervised Learning	Unverified
Artificial Eye for the Blind	Jul 7, 2023	Object, object-detection	Unverified
ContextSpeech: Expressive and Efficient Text-to-Speech for Paragraph Reading	Jul 3, 2023	Form, Sentence	Unverified
High-Quality Automatic Voice Over with Accurate Alignment: Supervision through Self-Supervised Discrete Speech Units	Jun 29, 2023	Speech Synthesis, text-to-speech	Unverified
GenerTTS: Pronunciation Disentanglement for Timbre and Style Generalization in Cross-Lingual Text-to-Speech	Jun 27, 2023	Disentanglement, Style Generalization	Unverified
DSE-TTS: Dual Speaker Embedding for Cross-Lingual Text-to-Speech	Jun 25, 2023	Speech Synthesis, text-to-speech	Unverified
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale	Jun 23, 2023	In-Context Learning, Speech Synthesis	Code Available
Visual-Aware Text-to-Speech	Jun 21, 2023	Rhythm, Speech Synthesis	Unverified
Expressive Machine Dubbing Through Phrase-level Cross-lingual Prosody Transfer	Jun 20, 2023	text-to-speech, Text to Speech	Unverified
Low-Resource Text-to-Speech Using Specific Data and Noise Augmentation	Jun 16, 2023	Data Augmentation, text-to-speech	Unverified
CML-TTS A Multilingual Dataset for Speech Synthesis in Low-Resource Languages	Jun 16, 2023	Speech Synthesis, text-to-speech	Unverified
Improving Code-Switching and Named Entity Recognition in ASR with Speech Editing based Data Augmentation	Jun 14, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
PauseSpeech: Natural Speech Synthesis via Pre-trained Language Model and Pause-based Prosody Modeling	Jun 13, 2023	Language Modeling, Language Modelling	Unverified
Learning Emotional Representations from Imbalanced Speech Data for Speech Emotion Recognition and Emotional Text-to-Speech	Jun 9, 2023	Emotion Recognition, Speech Emotion Recognition	Unverified
VIFS: An End-to-End Variational Inference for Foley Sound Synthesis	Jun 8, 2023	Speech Synthesis, text-to-speech	Code Available
Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias	Jun 6, 2023	Attribute, Inductive Bias	Unverified
Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis	Jun 6, 2023	Neural Rendering, text-to-speech	Unverified
Cross-Lingual Transfer Learning for Phrase Break Prediction with Multilingual Language Model	Jun 5, 2023	Cross-Lingual Transfer, Language Modeling	Unverified
Latent Optimal Paths by Gumbel Propagation for Variational Bayesian Dynamic Programming	Jun 5, 2023	Bayesian Inference, Singing Voice Synthesis	Code Available
Rhythm-controllable Attention with High Robustness for Long Sentence Speech Synthesis	Jun 5, 2023	Rhythm, Sentence	Unverified
Towards Robust FastSpeech 2 by Modelling Residual Multimodality	Jun 2, 2023	Decoder, text-to-speech	Unverified
The Effects of Input Type and Pronunciation Dictionary Usage in Transfer Learning for Low-Resource Text-to-Speech	Jun 1, 2023	Cross-Lingual Transfer, text-to-speech	Unverified
Text-to-Speech Pipeline for Swiss German -- A comparison	May 31, 2023	Speech Synthesis, text-to-speech	Unverified


No leaderboard results yet.