Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (the language code "ku" selects Kurdish).
    # Note: gTTS only accepts codes listed by gtts.lang.tts_langs(); check that "ku" is among them.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (on Windows)

# Example (the string means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
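The snippet above opens the saved file with `start`, which only works on Windows. A minimal sketch of a portable alternative follows, using only the standard library. The helper names `playback_command` and `play` are hypothetical, and the fallback to `xdg-open` assumes a Linux or BSD desktop where that tool is installed.

```python
import platform
import subprocess


def playback_command(path, system=None):
    """Return the argument list that opens `path` with the default player."""
    system = system or platform.system()
    if system == "Windows":
        # `start` is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if system == "Darwin":
        return ["open", path]
    # Assumption: any other system is a desktop that provides xdg-open.
    return ["xdg-open", path]


def play(path):
    """Open the audio file without building a shell command string."""
    subprocess.run(playback_command(path), check=False)
```

Passing the path as a separate list element, rather than interpolating it into a shell string, also avoids trouble with file names that contain spaces. Calling `play("output.mp3")` after `tts.save(...)` would take the place of the `os.system` line.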

Papers

Showing 251–300 of 1419 papers

Title	Date	Tasks	Status	Hype
MultiSpeech: Multi-Speaker Text to Speech with Transformer	Jun 8, 2020	Decoder, text-to-speech	Code Available	1
End-to-End Adversarial Text-to-Speech	Jun 5, 2020	Adversarial Text, Dynamic Time Warping	Code Available	1
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search	May 22, 2020	text-to-speech, Text to Speech	Code Available	1
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis	May 12, 2020	Speech Synthesis, Style Transfer	Code Available	1
From Speaker Verification to Multispeaker Speech Synthesis, Deep Transfer with Feedback Constraint	May 10, 2020	Speaker Verification, Speech Synthesis	Code Available	1
Transformer based Grapheme-to-Phoneme Conversion	Apr 14, 2020	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	1
g2pM: A Neural Grapheme-to-Phoneme Conversion Package for Mandarin Chinese Based on a New Open Benchmark Dataset	Apr 7, 2020	Grapheme-to-Phoneme Conversion, Polyphone disambiguation	Code Available	1
Perception of prosodic variation for speech synthesis using an unsupervised discrete representation of F0	Mar 14, 2020	Clustering, Representation Learning	Code Available	1
Semi-Supervised Neural Architecture Search	Feb 24, 2020	GPU, Natural Language Transduction	Code Available	1
Voice Transformer Network: Sequence-to-Sequence Voice Conversion Using Transformer with Text-to-Speech Pretraining	Dec 14, 2019	text-to-speech, Text to Speech	Code Available	1
Attention model for articulatory features detection	Jul 2, 2019	Manner Of Articulation Detection, model	Code Available	1
In Other News: A Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data	Apr 4, 2019	Speech Synthesis, text-to-speech	Code Available	1
Visualization and Interpretation of Latent Spaces for Controlling Expressive Speech Synthesis through Audio Analysis	Mar 27, 2019	Emotional Speech Synthesis, Expressive Speech Synthesis	Code Available	1
End-to-end Lyrics Alignment for Polyphonic Music Using an Audio-to-Character Recognition Model	Feb 18, 2019	Retrieval, text-to-speech	Code Available	1
Robust universal neural vocoding	Nov 15, 2018	text-to-speech, Text to Speech	Code Available	1
ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech	Jul 19, 2018	Speech Synthesis, text-to-speech	Code Available	1
Attentive Sequence-to-Sequence Learning for Diacritic Restoration of Yorùbá Language Text	Apr 3, 2018	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	1
Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention	Oct 24, 2017	text-to-speech, Text to Speech	Code Available	1
VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop	Jul 20, 2017	Sentence, text-to-speech	Code Available	1
Tacotron: Towards End-to-End Speech Synthesis	Mar 29, 2017	Audio Synthesis, Speech Synthesis	Code Available	1
WaveNet: A Generative Model for Raw Audio	Sep 12, 2016	Audio Generation, model	Code Available	1
Hear Your Code Fail, Voice-Assisted Debugging for Python	Jul 20, 2025	CPU, Medical Diagnosis	Unverified	0
NonverbalTTS: A Public English Corpus of Text-Aligned Nonverbal Vocalizations with Emotion Annotations for Text-to-Speech	Jul 17, 2025	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
P.808 Multilingual Speech Enhancement Testing: Approach and Results of URGENT 2025 Challenge	Jul 15, 2025	Speech Enhancement, text-to-speech	Unverified	0
An Empirical Evaluation of AI-Powered Non-Player Characters' Perceived Realism and Performance in Virtual Reality Environments	Jul 14, 2025	Speech-to-Text, text-to-speech	Unverified	0
Exploiting Leaderboards for Large-Scale Distribution of Malicious Models	Jul 11, 2025	Model Discovery, Text Generation	Unverified	0
MIDI-VALLE: Improving Expressive Piano Performance Synthesis Through Neural Codec Language Modelling	Jul 11, 2025	Audio Synthesis, Language Modelling	Unverified	0
Speech Quality Assessment Model Based on Mixture of Experts: System-Level Performance Enhancement and Utterance-Level Challenge Analysis	Jul 8, 2025	Data Augmentation, Mixture-of-Experts	Unverified	0
An Exploration of ECAPA-TDNN and x-vector Speaker Representations in Zero-shot Multi-speaker TTS	Jun 25, 2025	Speaker Recognition, text-to-speech	Unverified	0
TTSDS2: Resources and Benchmark for Evaluating Human-Quality Text to Speech Systems	Jun 24, 2025	text-to-speech, Text to Speech	Unverified	0
LM-SPT: LM-Aligned Semantic Distillation for Speech Tokenization	Jun 20, 2025	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
Optimizing Multilingual Text-To-Speech with Accents & Emotions	Jun 19, 2025	Disentanglement, Emotion Recognition	Unverified	0
Streaming Non-Autoregressive Model for Accent Conversion and Pronunciation Improvement	Jun 19, 2025	text-to-speech, Text to Speech	Unverified	0
PredGen: Accelerated Inference of Large Language Models through Input-Time Speculation for Real-Time Speech Interaction	Jun 18, 2025	Sentence, text-to-speech	Unverified	0
EmoNews: A Spoken Dialogue System for Expressive News Conversations	Jun 16, 2025	Language Modeling, Language Modelling	Code Available	0
StreamMel: Real-Time Zero-shot Text-to-Speech via Interleaved Continuous Autoregressive Modeling	Jun 14, 2025	text-to-speech, Text to Speech	Unverified	0
Phonikud: Hebrew Grapheme-to-Phoneme Conversion for Real-Time Text-to-Speech	Jun 14, 2025	Grapheme-to-Phoneme Conversion, text-to-speech	Unverified	0
Scheduled Interleaved Speech-Text Training for Speech-to-Speech Translation with LLMs	Jun 12, 2025	Speech-to-Speech Translation, text-to-speech	Unverified	0
S2ST-Omni: An Efficient and Scalable Multilingual Speech-to-Speech Translation Framework via Seamless Speech-Text Alignment and Streaming Speech Generation	Jun 11, 2025	Reading Comprehension, Speech Synthesis	Unverified	0
UmbraTTS: Adapting Text-to-Speech to Environmental Contexts with Flow Matching	Jun 11, 2025	Speech Synthesis, text-to-speech	Unverified	0
A Self-Refining Framework for Enhancing ASR Using TTS-Synthesized Data	Jun 10, 2025	text-to-speech, Text to Speech	Unverified	0
Seeing Voices: Generating A-Roll Video from Audio with Mirage	Jun 9, 2025	Speech Synthesis, text-to-speech	Unverified	0
Transcript-Prompted Whisper with Dictionary-Enhanced Decoding for Japanese Speech Annotation	Jun 9, 2025	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
Voice Impression Control in Zero-Shot TTS	Jun 6, 2025	Language Modeling, Language Modelling	Unverified	0
Intelligibility of Text-to-Speech Systems for Mathematical Expressions	Jun 5, 2025	text-to-speech, Text to Speech	Unverified	0
Grapheme-Coherent Phonemic and Prosodic Annotation of Speech by Implicit and Explicit Grapheme Conditioning	Jun 5, 2025	text-to-speech, Text to Speech	Unverified	0
Can we reconstruct a dysarthric voice with the large speech model Parler TTS?	Jun 4, 2025	text-to-speech, Text to Speech	Unverified	0
A Novel Data Augmentation Approach for Automatic Speaking Assessment on Opinion Expressions	Jun 4, 2025	Data Augmentation, Diversity	Unverified	0
BitTTS: Highly Compact Text-to-Speech Using 1.58-bit Quantization and Weight Indexing	Jun 4, 2025	Quantization, text-to-speech	Unverified	0
UniCUE: Unified Recognition and Generation Framework for Chinese Cued Speech Video-to-Speech Generation	Jun 4, 2025	cross-modal alignment, Lipreading	Unverified	0

No leaderboard results yet.