Text to Speech

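from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selects Kurdish).
    # gTTS validates the language code by default and raises ValueError if "ku"
    # is not among its supported languages; gtts.lang.tts_langs() lists them.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (Windows only)

# Example (the text means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")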
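The snippet above opens the saved file with the Windows `start` command, so playback fails on other systems. A minimal sketch of a cross-platform alternative follows, using only the standard library. The helper names `build_open_command` and `open_audio` are hypothetical, and the fallback assumes a Linux-like desktop with `xdg-open` installed.

```python
import platform
import subprocess


def build_open_command(path, system=None):
    """Return the argv list that opens `path` with the default application."""
    system = system or platform.system()
    if system == "Windows":
        # `start` is a cmd.exe builtin; the empty "" fills its window-title argument.
        return ["cmd", "/c", "start", "", path]
    if system == "Darwin":
        # macOS ships the `open` command.
        return ["open", path]
    # Assumption: any other platform is a Linux-like desktop with xdg-open.
    return ["xdg-open", path]


def open_audio(path):
    """Launch the default player for `path` without waiting for it to exit."""
    subprocess.Popen(build_open_command(path))
```

Passing the command as a list avoids shell quoting problems when the output file name contains spaces. Calling `open_audio(output_file)` in place of the `os.system` line would be one way to use it.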

Papers

Showing 651–700 of 1419 papers

Title	Date	Tasks	Status	Hype
A Survey on Audio Diffusion Models: Text To Speech Synthesis and Enhancement in Generative AI	Mar 23, 2023	Speech Enhancement, Speech Synthesis	Unverified	0
Code-Switching Text Generation and Injection in Mandarin-English ASR	Mar 20, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
Cross-speaker Emotion Transfer by Manipulating Speech Style Latents	Mar 15, 2023	text-to-speech, Text to Speech	Unverified	0
Controllable Prosody Generation With Partial Inputs	Mar 14, 2023	Speech Synthesis, text-to-speech	Unverified	0
QI-TTS: Questioning Intonation Control for Emotional Speech Synthesis	Mar 14, 2023	Emotional Speech Synthesis, Sentence	Unverified	0
An End-to-End Neural Network for Image-to-Audio Transformation	Mar 10, 2023	Image to text, text-to-speech	Unverified	0
Text-to-ECG: 12-Lead Electrocardiogram Synthesis conditioned on Clinical Text Reports	Mar 9, 2023	text-to-speech, Text to Speech	Code Available	0
Do Prosody Transfer Models Transfer Prosody?	Mar 7, 2023	Speech Synthesis, text-to-speech	Unverified	0
Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling	Mar 7, 2023	In-Context Learning, Language Modeling	Code Available	5
FoundationTTS: Text-to-Speech for ASR Customization with Generative Language Model	Mar 6, 2023	Language Modeling, Language Modelling	Unverified	0
Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations	Mar 3, 2023	Speech Denoising, Speech Enhancement	Code Available	1
Evaluating Parameter-Efficient Transfer Learning Approaches on SURE Benchmark for Speech Understanding	Mar 2, 2023	Speech Synthesis, text-to-speech	Code Available	1
Fine-grained Emotional Control of Text-To-Speech: Learning To Rank Inter- And Intra-Class Emotion Intensities	Mar 2, 2023	Learning-To-Rank, text-to-speech	Unverified	0
LiteG2P: A fast, light and high accuracy model for grapheme-to-phoneme conversion	Mar 2, 2023	Grapheme-to-Phoneme Conversion, speech-recognition	Unverified	0
Leveraging Large Text Corpora for End-to-End Speech Summarization	Mar 2, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
ParrotTTS: Text-to-Speech synthesis by exploiting self-supervised representations	Mar 1, 2023	Self-Supervised Learning, Speech Synthesis	Unverified	0
DTW-SiameseNet: Dynamic Time Warped Siamese Network for Mispronunciation Detection and Correction	Mar 1, 2023	Dynamic Time Warping, Metric Learning	Unverified	0
ClArTTS: An Open-Source Classical Arabic Text-to-Speech Corpus	Feb 28, 2023	Speech Synthesis, text-to-speech	Unverified	0
Automatic Heteronym Resolution Pipeline Using RAD-TTS Aligners	Feb 28, 2023	text-to-speech, Text to Speech	Unverified	0
CrossSpeech: Speaker-independent Acoustic Representation for Cross-lingual Speech Synthesis	Feb 28, 2023	Speech Synthesis, text-to-speech	Unverified	0
UniFLG: Unified Facial Landmark Generator from Text or Speech	Feb 28, 2023	Decoder, Face Generation	Unverified	0
Duration-aware pause insertion using pre-trained language model for multi-speaker text-to-speech	Feb 27, 2023	Language Modeling, Language Modelling	Unverified	0
Varianceflow: High-Quality and Controllable Text-to-Speech using Variance Information via Normalizing Flow	Feb 27, 2023	text-to-speech, Text to Speech	Unverified	0
Imaginary Voice: Face-styled Diffusion Model for Text-to-Speech	Feb 27, 2023	Speech Synthesis, text-to-speech	Code Available	1
PITS: Variational Pitch Inference without Fundamental Frequency for End-to-End Pitch-controllable TTS	Feb 24, 2023	Decoder, text-to-speech	Code Available	2
Emphasizing Unseen Words: New Vocabulary Acquisition for End-to-End Speech Recognition	Feb 20, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
Fast and small footprint Hybrid HMM-HiFiGAN based system for speech synthesis in Indian languages	Feb 13, 2023	Speech Synthesis, text-to-speech	Unverified	0
A Vector Quantized Approach for Text to Speech Synthesis on Real-World Spontaneous Speech	Feb 8, 2023	Code Generation, Diversity	Code Available	2
MAC: A unified framework boosting low resource automatic speech recognition	Feb 5, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
UzbekTagger: The rule-based POS tagger for Uzbek language	Jan 30, 2023	Language Modeling, Language Modelling	Unverified	0
Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining	Jan 30, 2023	Language Modeling, Language Modelling	Code Available	1
Time out of Mind: Generating Rate of Speech conditioned on emotion and speaker	Jan 29, 2023	Speech Synthesis, text-to-speech	Code Available	0
On granularity of prosodic representations in expressive text-to-speech	Jan 26, 2023	Expressive Speech Synthesis, Speech Synthesis	Unverified	0
Unsupervised Data Selection for TTS: Using Arabic Broadcast News as a Case Study	Jan 22, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	0
Phoneme-Level BERT for Enhanced Prosody of Text-to-Speech with Grapheme Predictions	Jan 20, 2023	text-to-speech, Text to Speech	Code Available	5
Modelling low-resource accents without accent-specific TTS frontend	Jan 11, 2023	text-to-speech, Text to Speech	Unverified	0
UnifySpeech: A Unified Framework for Zero-shot Text-to-Speech and Voice Conversion	Jan 10, 2023	Quantization, text-to-speech	Unverified	0
Applying Automated Machine Translation to Educational Video Courses	Jan 9, 2023	Machine Translation, Speech Synthesis	Unverified	0
Using External Off-Policy Speech-To-Text Mappings in Contextual End-To-End Automated Speech Recognition	Jan 6, 2023	Domain Adaptation, GPU	Unverified	0
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers	Jan 5, 2023	In-Context Learning, Language Modeling	Code Available	7
ReVISE: Self-Supervised Speech Resynthesis With Visual Input for Universal and Generalized Speech Regeneration	Jan 1, 2023	Audio-Visual Speech Recognition, Resynthesis	Unverified	0
ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech	Dec 30, 2022	Denoising, text-to-speech	Code Available	1
StyleTTS-VC: One-Shot Voice Conversion by Knowledge Transfer from Style-Based TTS Models	Dec 29, 2022	Data Augmentation, text-to-speech	Code Available	1
HMM-based data augmentation for E2E systems for building conversational speech synthesis systems	Dec 22, 2022	Data Augmentation, Language Modeling	Unverified	0
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement	Dec 21, 2022	Audio-Visual Speech Recognition, Resynthesis	Unverified	0
Improving the quality of neural TTS using long-form content and multi-speaker multi-style modeling	Dec 20, 2022	Form, text-to-speech	Unverified	0
TTS-Guided Training for Accent Conversion Without Parallel Data	Dec 20, 2022	Decoder, text-to-speech	Unverified	0
Text-to-speech synthesis based on latent variable conversion using diffusion probabilistic model and variational autoencoder	Dec 16, 2022	Representation Learning, Speech Synthesis	Unverified	0
Investigation of Japanese PnG BERT language model in text-to-speech synthesis for pitch accent language	Dec 16, 2022	Language Modeling, Language Modelling	Unverified	0
Speech Aware Dialog System Technology Challenge (DSTC11)	Dec 16, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0

No leaderboard results yet.