Text to Speech

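from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selects Kurdish).
    # Note: gTTS validates `lang` against its supported list (see gtts.lang.tts_langs())
    # and raises ValueError if "ku" is not in it.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (Windows only)

# Example (the string means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")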
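The snippet above opens the saved file with the Windows-only `start` command. A minimal sketch of a cross-platform alternative follows; the helper names `build_open_command` and `open_audio_file` are hypothetical, and the non-Windows, non-macOS branch assumes a desktop with `xdg-open` installed.

```python
import subprocess
import sys


def build_open_command(path, platform=sys.platform):
    """Return the argument list that opens `path` with the default application."""
    if platform.startswith("win"):
        # `start` is a cmd.exe builtin; the empty string fills its window-title slot.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    # Assumption: a Linux/BSD desktop where xdg-utils is available.
    return ["xdg-open", path]


def open_audio_file(path):
    """Open the audio file in the system's default player (hypothetical helper)."""
    subprocess.run(build_open_command(path), check=False)
```

Calling `open_audio_file("output.mp3")` in place of the `os.system(...)` line would keep the rest of the function unchanged. Passing the path as a separate list element also avoids shell quoting problems with file names that contain spaces.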

Papers


Showing 651–700 of 1419 papers

Title	Date	Tasks	Status
Boosting Large Language Model for Speech Synthesis: An Empirical Study	Dec 30, 2023	Language Modeling, Language Modelling	Unverified
Normalization of Lithuanian Text Using Regular Expressions	Dec 29, 2023	Speech Synthesis, Text Normalization	Unverified
AE-Flow: AutoEncoder Normalizing Flow	Dec 27, 2023	text-to-speech, Text to Speech	Unverified
Creating New Voices using Normalizing Flows	Dec 22, 2023	Speech Synthesis, text-to-speech	Unverified
External Knowledge Augmented Polyphone Disambiguation Using Large Language Model	Dec 19, 2023	Decoder, Language Modeling	Unverified
A review-based study on different Text-to-Speech technologies	Dec 17, 2023	text-to-speech, Text to Speech	Unverified
MM-TTS: Multi-modal Prompt based Style Transfer for Expressive Text-to-Speech Synthesis	Dec 17, 2023	Speech Synthesis, Style Transfer	Unverified
An Experimental Study: Assessing the Combined Framework of WavLM and BEST-RQ for Text-to-Speech Synthesis	Dec 8, 2023	Benchmarking, Quantization	Unverified
Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis	Dec 6, 2023	Speech Synthesis, text-to-speech	Unverified
Rapid Speaker Adaptation in Low Resource Text to Speech Systems using Synthetic Data and Transfer learning	Dec 2, 2023	Decoder, text-to-speech	Unverified
Code-Mixed Text to Speech Synthesis under Low-Resource Constraints	Dec 2, 2023	Speech Synthesis, text-to-speech	Unverified
Vulnerability of Automatic Identity Recognition to Audio-Visual Deepfakes	Nov 29, 2023	Face Recognition, Face Swapping	Unverified
Guided Flows for Generative Modeling and Decision Making	Nov 22, 2023	Conditional Image Generation, Decision Making	Unverified
Data Center Audio/Video Intelligence on Device (DAVID) -- An Edge-AI Platform for Smart-Toys	Nov 18, 2023	text-to-speech, Text to Speech	Unverified
Utilizing Speech Emotion Recognition and Recommender Systems for Negative Emotion Handling in Therapy Chatbots	Nov 18, 2023	Chatbot, Emotion Recognition	Unverified
A Study on Altering the Latent Space of Pretrained Text to Speech Models for Improved Expressiveness	Nov 17, 2023	text-to-speech, Text to Speech	Unverified
ChatAnything: Facetime Chat with LLM-Enhanced Personas	Nov 12, 2023	Image Generation, In-Context Learning	Unverified
Synthetic Speaking Children -- Why We Need Them and How to Make Them	Nov 8, 2023	text-to-speech, Text to Speech	Unverified
Character-Level Bangla Text-to-IPA Transcription Using Transformer Architecture with Sequence Alignment	Nov 7, 2023	Decoder, Position	Unverified
Transduce and Speak: Neural Transducer for Text-to-Speech with Semantic Token Prediction	Nov 6, 2023	text-to-speech, Text to Speech	Unverified
E3 TTS: Easy End-to-End Diffusion-based Text to Speech	Nov 2, 2023	text-to-speech, Text to Speech	Unverified
Expressive TTS Driven by Natural Language Prompts Using Few Human Annotations	Nov 2, 2023	Language Modeling, Language Modelling	Unverified
Style Description based Text-to-Speech with Conditional Prosodic Layer Normalization based Diffusion GAN	Oct 27, 2023	Decoder, Denoising	Unverified
Generative Pre-training for Speech with Flow Matching	Oct 25, 2023	Speech Enhancement, Speech Synthesis	Unverified
DPP-TTS: Diversifying prosodic features of speech via determinantal point processes	Oct 23, 2023	Diversity, Point Processes	Unverified
An overview of text-to-speech systems and media applications	Oct 22, 2023	Acoustic Modelling, text-to-speech	Unverified
Attentive Multi-Layer Perceptron for Non-autoregressive Generation	Oct 14, 2023	Machine Translation, Speech Synthesis	Code Available
On the Relevance of Phoneme Duration Variability of Synthesized Training Data for Automatic Speech Recognition	Oct 12, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Prosody Analysis of Audiobooks	Oct 10, 2023	Attribute, Language Modeling	Code Available
Neutral TTS Female Voice Corpus in Brazilian Portuguese	Oct 8, 2023	Speech Synthesis, text-to-speech	Unverified
Unified speech and gesture synthesis using flow matching	Oct 8, 2023	Audio Synthesis, Motion Synthesis	Unverified
Comparative Analysis of Transfer Learning in Deep Learning Text-to-Speech Models on a Few-Shot, Low-Resource, Customized Dataset	Oct 8, 2023	text-to-speech, Text to Speech	Unverified
Latent Filling: Latent Space Data Augmentation for Zero-shot Speech Synthesis	Oct 5, 2023	Data Augmentation, Speech Synthesis	Unverified
The VoiceMOS Challenge 2023: Zero-shot Subjective Speech Quality Prediction for Multiple Domains	Oct 4, 2023	Speech Synthesis, text-to-speech	Unverified
Towards human-like spoken dialogue generation between AI agents from written dialogue	Oct 2, 2023	Dialogue Generation, text-to-speech	Unverified
Low-Resource Self-Supervised Learning with SSL-Enhanced TTS	Sep 29, 2023	Self-Supervised Learning, text-to-speech	Unverified
Synthetic Speech Detection Based on Temporal Consistency and Distribution of Speaker Features	Sep 29, 2023	Synthetic Speech Detection, text-to-speech	Unverified
High-Fidelity Speech Synthesis with Minimal Supervision: All Using Diffusion Models	Sep 27, 2023	All, Speech Synthesis	Unverified
Face-StyleSpeech: Enhancing Zero-shot Speech Synthesis from Face Images with Improved Face-to-Speech Mapping	Sep 25, 2023	Speech Synthesis, text-to-speech	Unverified
VoiceLDM: Text-to-Speech with Environmental Context	Sep 24, 2023	AudioCaps, text-to-speech	Unverified
DurIAN-E: Duration Informed Attention Network For Expressive Text-to-Speech Synthesis	Sep 22, 2023	Denoising, Speech Synthesis	Unverified
The Impact of Silence on Speech Anti-Spoofing	Sep 21, 2023	Action Detection, Activity Detection	Unverified
Speak While You Think: Streaming Speech Synthesis During Text Generation	Sep 20, 2023	Speech Synthesis, Text Generation	Unverified
Exploring Speech Enhancement for Low-resource Speech Synthesis	Sep 19, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Leveraging Speech PTM, Text LLM, and Emotional TTS for Speech Emotion Recognition	Sep 19, 2023	Data Augmentation, Emotion Recognition	Unverified
Augmenting text for spoken language understanding with Large Language Models	Sep 17, 2023	Semantic Parsing, Spoken Language Understanding	Unverified
PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-to-Speech Using Natural Language Descriptions	Sep 15, 2023	text-to-speech, Text to Speech	Unverified
Cross-lingual Knowledge Distillation via Flow-based Voice Conversion for Robust Polyglot Text-To-Speech	Sep 15, 2023	Knowledge Distillation, Speech Synthesis	Unverified
Direct Text to Speech Translation System using Acoustic Units	Sep 14, 2023	Decoder, Speech-to-Speech Translation	Unverified
Cross-Utterance Conditioned VAE for Speech Generation	Sep 8, 2023	Speech Synthesis, text-to-speech	Unverified


No leaderboard results yet.