Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku").
    # Note: gTTS raises ValueError if "ku" is not among its supported languages.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (Windows only)

# Example (the text means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
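The snippet opens the saved file with `start`, which exists only on Windows. The following is a minimal sketch of a cross-platform alternative using only the standard library. The helper names `build_open_command` and `open_audio_file` are hypothetical and not part of gTTS, and the Linux branch assumes a desktop system with `xdg-open` installed.

```python
import platform
import subprocess


def build_open_command(path, system=None):
    """Return the argument list that opens `path` with the default application.

    `system` defaults to platform.system(); pass it explicitly to preview
    the command that would be used on another operating system.
    """
    system = system or platform.system()
    if system == "Windows":
        # `start` is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if system == "Darwin":
        # macOS ships the `open` utility.
        return ["open", path]
    # Assumption: a Linux/BSD desktop with xdg-utils installed.
    return ["xdg-open", path]


def open_audio_file(path):
    """Open `path` in the default player (hypothetical helper)."""
    # check=False: a missing opener reports a non-zero exit code
    # instead of raising CalledProcessError.
    subprocess.run(build_open_command(path), check=False)
```

Calling `open_audio_file(output_file)` in place of the `os.system(...)` line would keep the rest of the function unchanged. Passing the path as a list element also avoids shell quoting problems with file names that contain spaces.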

Papers


Showing 1351–1400 of 1419 papers

Title	Date	Tasks	Status
Does Audio Deepfake Detection Generalize?	Mar 30, 2022	Audio Deepfake Detection, DeepFake Detection	Unverified
Do Prosody Transfer Models Transfer Prosody?	Mar 7, 2023	Speech Synthesis, text-to-speech	Unverified
DPI-TTS: Directional Patch Interaction for Fast-Converging and Style Temporal Modeling in Text-to-Speech	Sep 18, 2024	text-to-speech, Text to Speech	Unverified
DPP-TTS: Diversifying prosodic features of speech via determinantal point processes	Oct 23, 2023	Diversity, Point Processes	Unverified
DSE-TTS: Dual Speaker Embedding for Cross-Lingual Text-to-Speech	Jun 25, 2023	Speech Synthesis, text-to-speech	Unverified
DTW-SiameseNet: Dynamic Time Warped Siamese Network for Mispronunciation Detection and Correction	Mar 1, 2023	Dynamic Time Warping, Metric Learning	Unverified
Dual Audio-Centric Modality Coupling for Talking Head Generation	Mar 26, 2025	NeRF, Talking Head Generation	Unverified
Dual Script E2E framework for Multilingual and Code-Switching ASR	Jun 2, 2021	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
DualSpeech: Enhancing Speaker-Fidelity and Text-Intelligibility Through Dual Classifier-Free Guidance	Aug 26, 2024	Diversity, text-to-speech	Unverified
Dual Supervised Learning	Jul 3, 2017	General Classification, image-classification	Unverified
DubWise: Video-Guided Speech Duration Control in Multimodal LLM-based Text-to-Speech for Dubbing	Jun 13, 2024	Language Modeling, Language Modelling	Unverified
Duration-aware pause insertion using pre-trained language model for multi-speaker text-to-speech	Feb 27, 2023	Language Modeling, Language Modelling	Unverified
DurIAN-E 2: Duration Informed Attention Network with Adaptive Variational Autoencoder and Adversarial Learning for Expressive Text-to-Speech Synthesis	Oct 17, 2024	Speech Synthesis, text-to-speech	Unverified
DurIAN-E: Duration Informed Attention Network For Expressive Text-to-Speech Synthesis	Sep 22, 2023	Denoising, Speech Synthesis	Unverified
Dynamic Prosody Generation for Speech Synthesis using Linguistics-Driven Acoustic Embedding Selection	Dec 2, 2019	Speech Synthesis, text-to-speech	Unverified
E1 TTS: Simple and Fast Non-Autoregressive TTS	Sep 14, 2024	Denoising, text-to-speech	Unverified
E3 TTS: Easy End-to-End Diffusion-based Text to Speech	Nov 2, 2023	text-to-speech, Text to Speech	Unverified
Easy, Interpretable, Effective: openSMILE for voice deepfake detection	Aug 28, 2024	DeepFake Detection, Face Swapping	Unverified
Effective Decoder Masking for Transformer Based End-to-End Speech Recognition	Oct 27, 2020	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Effectiveness of text to speech pseudo labels for forced alignment and cross lingual pretrained models for low resource speech recognition	Mar 31, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Effect of choice of probability distribution, randomness, and search methods for alignment modeling in sequence-to-sequence text-to-speech synthesis using hard alignment	Oct 28, 2019	Hard Attention, Speech Synthesis	Unverified
Efficient data selection employing Semantic Similarity-based Graph Structures for model training	Feb 22, 2024	Semantic Similarity, Semantic Textual Similarity	Unverified
Efficient Generative Modeling with Residual Vector Quantization-Based Tokens	Dec 13, 2024	Conditional Image Generation, Image Generation	Unverified
Efficient Incremental Text-to-Speech on GPUs	Nov 25, 2022	GPU, Speech Synthesis	Unverified
Efficiently Trained Low-Resource Mongolian Text-to-Speech System Based On FullConv-TTS	Oct 24, 2022	Data Augmentation, GPU	Unverified
Efficient training strategies for natural sounding speech synthesis and speaker adaptation based on FastPitch	Oct 9, 2024	Speech Synthesis, text-to-speech	Unverified
ELAICHI: Enhancing Low-resource TTS by Addressing Infrequent and Low-frequency Character Bigrams	Oct 23, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
ELLA-V: Stable Neural Codec Language Modeling with Alignment-guided Sequence Reordering	Jan 14, 2024	Audio Generation, Language Modeling	Unverified
EmoCat: Language-agnostic Emotional Voice Conversion	Jan 14, 2021	Decoder, text-to-speech	Unverified
EmoDiff: Intensity Controllable Emotional Text-to-Speech with Soft-Label Guidance	Nov 17, 2022	Denoising, text-to-speech	Unverified
Emo-DPO: Controllable Emotional Speech Synthesis through Direct Preference Optimization	Sep 16, 2024	Emotional Speech Synthesis, In-Context Learning	Unverified
EmoSpeech: A Corpus of Emotionally Rich and Contextually Detailed Speech Annotations	Dec 9, 2024	text-to-speech, Text to Speech	Unverified
EmoTalkingGaussian: Continuous Emotion-conditioned Talking Head Synthesis	Feb 2, 2025	Self-Supervised Learning, SSIM	Unverified
Emotional Dimension Control in Language Model-Based Text-to-Speech: Spanning a Broad Spectrum of Human Emotions	Sep 25, 2024	Attribute, Dimensionality Reduction	Unverified
Emotional Prosody Control for Speech Generation	Nov 7, 2021	text-to-speech, Text to Speech	Unverified
Emotion controllable speech synthesis using emotion-unlabeled dataset with the assistance of cross-domain speech emotion recognition	Oct 26, 2020	Emotion Recognition, Speech Emotion Recognition	Unverified
EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model	Jun 17, 2021	Emotional Speech Synthesis, Emotion Classification	Unverified
EmoVoice: LLM-based Emotional Text-To-Speech Model with Freestyle Text Prompting	Apr 17, 2025	text-to-speech, Text to Speech	Unverified
Empathic Machines: Using Intermediate Features as Levers to Emulate Emotions in Text-To-Speech Systems	Jan 16, 2022	text-to-speech, Text to Speech	Unverified
Empathic Machines: Using Intermediate Features as Levers to Emulate Emotions in Text-To-Speech Systems	Jul 1, 2022	text-to-speech, Text to Speech	Unverified
Emphasis control for parallel neural TTS	Oct 6, 2021	Sentence, text-to-speech	Unverified
Emphasized Accent Phrase Prediction from Text for Advertisement Text-To-Speech Synthesis	Dec 1, 2014	Speech Synthesis, text-to-speech	Unverified
Emphasizing Unseen Words: New Vocabulary Acquisition for End-to-End Speech Recognition	Feb 20, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Empowering Global Voices: A Data-Efficient, Phoneme-Tone Adaptive Approach to High-Fidelity Speech Synthesis	Apr 10, 2025	Speech Synthesis, text-to-speech	Unverified
EM-TTS: Efficiently Trained Low-Resource Mongolian Lightweight Text-to-Speech	Mar 13, 2024	GPU, Speech Synthesis	Unverified
End-to-End Feedback Loss in Speech Chain Framework via Straight-Through Estimator	Oct 31, 2018	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
End to end Hindi to English speech conversion using Bark, mBART and a finetuned XLSR Wav2Vec2	Jan 11, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
End-to-end speech recognition modeling from de-identified data	Jul 12, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
End-to-End Text-to-Speech Based on Latent Representation of Speaking Styles Using Spontaneous Dialogue	Jun 24, 2022	text-to-speech, Text to Speech	Unverified
End-to-end Text-to-speech for Low-resource Languages by Cross-Lingual Transfer Learning	Apr 13, 2019	Cross-Lingual Transfer, text-to-speech	Unverified


No leaderboard results yet.