Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku").
    # gTTS raises ValueError for language codes it does not support,
    # so confirm that "ku" is available in your installed version.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (Windows only)

# Example (the text means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
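The snippet above opens the saved file with the Windows-only `start` command. A minimal sketch of a portable alternative follows, using only the standard library. The helper names `build_open_command` and `open_with_default_app` are hypothetical, and the Linux branch assumes a desktop with `xdg-open` installed.

```python
import subprocess
import sys


def build_open_command(path, platform=sys.platform):
    """Return the argument list that opens `path` with the default app."""
    if platform.startswith("win"):
        # `start` is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    # Assumption: a Linux/BSD desktop where xdg-utils provides xdg-open.
    return ["xdg-open", path]


def open_with_default_app(path):
    """Launch the platform's default handler for `path` (hypothetical helper)."""
    subprocess.run(build_open_command(path), check=False)
```

Passing the path as a separate list element, rather than interpolating it into a shell string, also avoids problems with file names that contain spaces.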

Papers

Showing 451–500 of 1419 papers

Title	Date	Tasks	Status	Hype
VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech	Jan 25, 2024	Decoder, Hallucination	Unverified	0
SpeechGPT-Gen: Scaling Chain-of-Information Speech Generation	Jan 24, 2024	text-to-speech, Text to Speech	Code Available	5
Maximizing Data Efficiency for Cross-Lingual TTS Adaptation by Self-Supervised Representation Mixing and Embedding Initialization	Jan 23, 2024	text-to-speech, Text to Speech	Unverified	0
Adversarial speech for voice privacy protection from Personalized Speech generation	Jan 22, 2024	Speaker Verification, text-to-speech	Unverified	0
Empowering Communication: Speech Technology for Indian and Western Accents through AI-powered Speech Synthesis	Jan 22, 2024	Speaker Verification, Speech Synthesis	Unverified	0
Benchmarking Large Multimodal Models against Common Corruptions	Jan 22, 2024	Benchmarking, Image to text	Code Available	1
Data-driven grapheme-to-phoneme representations for a lexicon-free text-to-speech	Jan 19, 2024	Self-Supervised Learning, text-to-speech	Unverified	0
DurFlex-EVC: Duration-Flexible Emotional Voice Conversion Leveraging Discrete Representations without Text Alignment	Jan 16, 2024	Disentanglement, Self-Supervised Learning	Code Available	2
MCMChaos: Improvising Rap Music with MCMC Methods and Chaos Theory	Jan 15, 2024	Music Generation, text-to-speech	Unverified	0
ELLA-V: Stable Neural Codec Language Modeling with Alignment-guided Sequence Reordering	Jan 14, 2024	Audio Generation, Language Modeling	Unverified	0
Multi-Task Learning for Front-End Text Processing in TTS	Jan 12, 2024	Language Modeling, Language Modelling	Code Available	1
End to end Hindi to English speech conversion using Bark, mBART and a finetuned XLSR Wav2Vec2	Jan 11, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
Noise-robust zero-shot text-to-speech synthesis conditioned on self-supervised speech-representation model with adapters	Jan 10, 2024	Self-Supervised Learning, Speech Enhancement	Unverified	0
Transfer the linguistic representations from TTS to accent conversion with non-parallel data	Jan 7, 2024	text-to-speech, Text to Speech	Unverified	0
Evaluating and Personalizing User-Perceived Quality of Text-to-Speech Voices for Delivering Mindfulness Meditation with Different Physical Embodiments	Jan 7, 2024	text-to-speech, Text to Speech	Unverified	0
Incremental FastPitch: Chunk-based High Quality Text to Speech	Jan 3, 2024	Speech Synthesis, text-to-speech	Unverified	0
Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction	Jan 3, 2024	text-to-speech, Text to Speech	Unverified	0
Boosting Large Language Model for Speech Synthesis: An Empirical Study	Dec 30, 2023	Language Modeling, Language Modelling	Unverified	0
Normalization of Lithuanian Text Using Regular Expressions	Dec 29, 2023	Speech Synthesis, Text Normalization	Unverified	0
AE-Flow: AutoEncoder Normalizing Flow	Dec 27, 2023	text-to-speech, Text to Speech	Unverified	0
Creating New Voices using Normalizing Flows	Dec 22, 2023	Speech Synthesis, text-to-speech	Unverified	0
External Knowledge Augmented Polyphone Disambiguation Using Large Language Model	Dec 19, 2023	Decoder, Language Modeling	Unverified	0
A review-based study on different Text-to-Speech technologies	Dec 17, 2023	text-to-speech, Text to Speech	Unverified	0
MM-TTS: Multi-modal Prompt based Style Transfer for Expressive Text-to-Speech Synthesis	Dec 17, 2023	Speech Synthesis, Style Transfer	Unverified	0
Neural Text to Articulate Talk: Deep Text to Audiovisual Speech Synthesis achieving both Auditory and Photo-realism	Dec 11, 2023	Face Generation, Lip Reading	Code Available	1
An Experimental Study: Assessing the Combined Framework of WavLM and BEST-RQ for Text-to-Speech Synthesis	Dec 8, 2023	Benchmarking, Quantization	Unverified	0
Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis	Dec 6, 2023	Speech Synthesis, text-to-speech	Unverified	0
Rapid Speaker Adaptation in Low Resource Text to Speech Systems using Synthetic Data and Transfer learning	Dec 2, 2023	Decoder, text-to-speech	Unverified	0
Code-Mixed Text to Speech Synthesis under Low-Resource Constraints	Dec 2, 2023	Speech Synthesis, text-to-speech	Unverified	0
Vulnerability of Automatic Identity Recognition to Audio-Visual Deepfakes	Nov 29, 2023	Face Recognition, Face Swapping	Unverified	0
Learning Arousal-Valence Representation from Categorical Emotion Labels of Speech	Nov 24, 2023	Dimensionality Reduction, Emotion Classification	Code Available	1
Guided Flows for Generative Modeling and Decision Making	Nov 22, 2023	Conditional Image Generation, Decision Making	Unverified	0
HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis	Nov 21, 2023	Speech Synthesis, Super-Resolution	Code Available	3
Data Center Audio/Video Intelligence on Device (DAVID) -- An Edge-AI Platform for Smart-Toys	Nov 18, 2023	text-to-speech, Text to Speech	Unverified	0
Utilizing Speech Emotion Recognition and Recommender Systems for Negative Emotion Handling in Therapy Chatbots	Nov 18, 2023	Chatbot, Emotion Recognition	Unverified	0
A Study on Altering the Latent Space of Pretrained Text to Speech Models for Improved Expressiveness	Nov 17, 2023	text-to-speech, Text to Speech	Unverified	0
Improving fairness for spoken language understanding in atypical speech with Text-to-Speech	Nov 16, 2023	Data Augmentation, Fairness	Code Available	1
ChatAnything: Facetime Chat with LLM-Enhanced Personas	Nov 12, 2023	Image Generation, In-Context Learning	Unverified	0
Synthetic Speaking Children -- Why We Need Them and How to Make Them	Nov 8, 2023	text-to-speech, Text to Speech	Unverified	0
Character-Level Bangla Text-to-IPA Transcription Using Transformer Architecture with Sequence Alignment	Nov 7, 2023	Decoder, Position	Unverified	0
Improved Child Text-to-Speech Synthesis through Fastpitch-based Transfer Learning	Nov 7, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	1
Transduce and Speak: Neural Transducer for Text-to-Speech with Semantic Token Prediction	Nov 6, 2023	text-to-speech, Text to Speech	Unverified	0
E3 TTS: Easy End-to-End Diffusion-based Text to Speech	Nov 2, 2023	text-to-speech, Text to Speech	Unverified	0
Expressive TTS Driven by Natural Language Prompts Using Few Human Annotations	Nov 2, 2023	Language Modeling, Language Modelling	Unverified	0
Style Description based Text-to-Speech with Conditional Prosodic Layer Normalization based Diffusion GAN	Oct 27, 2023	Decoder, Denoising	Unverified	0
Generative Pre-training for Speech with Flow Matching	Oct 25, 2023	Speech Enhancement, Speech Synthesis	Unverified	0
ArTST: Arabic Text and Speech Transformer	Oct 25, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	1
DPP-TTS: Diversifying prosodic features of speech via determinantal point processes	Oct 23, 2023	Diversity, Point Processes	Unverified	0
An overview of text-to-speech systems and media applications	Oct 22, 2023	Acoustic Modelling, text-to-speech	Unverified	0
Generative Adversarial Training for Text-to-Speech Synthesis Based on Raw Phonetic Input and Explicit Prosody Modelling	Oct 14, 2023	Speech Synthesis, text-to-speech	Code Available	2

