Text to Speech

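from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selected for Kurdish).
    # Note: gTTS validates language codes; check gtts.lang.tts_langs() to confirm
    # that "ku" is available in the installed version.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (Windows only)

# Example (the Kurdish text means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")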
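
The snippet above opens the saved audio with the Windows `start` command, so it does nothing useful on macOS or Linux. A minimal sketch of a portable alternative follows. The helper names `open_command` and `open_file` are hypothetical and not part of gTTS, and the Linux branch assumes `xdg-open` (from xdg-utils) is installed.

```python
import platform
import subprocess


def open_command(path, system=None):
    """Return the argument list that opens `path` with the default application."""
    system = system or platform.system()
    if system == "Windows":
        # `start` is a cmd.exe builtin; the empty string is its window-title argument.
        return ["cmd", "/c", "start", "", path]
    if system == "Darwin":
        # macOS ships the `open` command.
        return ["open", path]
    # Assumption: a desktop Linux/BSD system with xdg-utils available.
    return ["xdg-open", path]


def open_file(path):
    """Launch the default player for `path` using an argument list, not a shell string."""
    subprocess.run(open_command(path), check=False)
```

Calling `open_file("output.mp3")` after `tts.save("output.mp3")` could then replace the `os.system` line. Passing an argument list rather than an interpolated shell string also keeps file names with spaces intact.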

Papers


Showing 651–700 of 1419 papers

Title	Date	Tasks	Status
Accented Text-to-Speech Synthesis with Limited Data	May 8, 2023	Speech Synthesis, text-to-speech	Unverified
Data Center Audio/Video Intelligence on Device (DAVID) -- An Edge-AI Platform for Smart-Toys	Nov 18, 2023	text-to-speech, Text to Speech	Unverified
Improved Prosodic Clustering for Multispeaker and Speaker-independent Phoneme-level Prosody Control	Nov 19, 2021	Clustering, Data Augmentation	Unverified
Data Augmentation Methods for End-to-end Speech Recognition on Distant-Talk Scenarios	Jun 7, 2021	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
DASB -- Discrete Audio and Speech Benchmark	Jun 20, 2024	Benchmarking, Emotion Recognition	Unverified
IMaSC -- ICFOSS Malayalam Speech Corpus	Nov 23, 2022	Sentence, text-to-speech	Unverified
DART: Disentanglement of Accent and Speaker Representation in Multispeaker Text-to-Speech	Oct 17, 2024	Disentanglement, Quantization	Unverified
Analysis and Utilization of Entrainment on Acoustic and Emotion Features in User-agent Dialogue	Dec 7, 2022	Spoken Dialogue Systems, text-to-speech	Unverified
HybridNet: A Hybrid Neural Architecture to Speed-up Autoregressive Models	Jan 1, 2018	Speech Synthesis, text-to-speech	Unverified
Huqariq: A Multilingual Speech Corpus of Native Languages of Peru for Speech Recognition	Jun 1, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Daisy-TTS: Simulating Wider Spectrum of Emotions via Prosody Embedding Decomposition	Feb 22, 2024	text-to-speech, Text to Speech	Unverified
Huqariq: A Multilingual Speech Corpus of Native Languages of Peru for Speech Recognition	Jul 12, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Human-in-the-loop Speaker Adaptation for DNN-based Multi-speaker TTS	Jun 21, 2022	text-to-speech, Text to Speech	Unverified
Cycle-consistency training for end-to-end speech recognition	Nov 2, 2018	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Impact of Frame Rates on Speech Tokenizer: A Case Study on Mandarin and English	May 20, 2025	Automatic Speech Recognition, speech-recognition	Unverified
Improve Cross-lingual Voice Cloning Using Low-quality Code-switched Data	Oct 14, 2021	text-to-speech, Text to Speech	Unverified
Human Detection of Political Speech Deepfakes across Transcripts, Audio, and Video	Feb 25, 2022	Face Swapping, Human Detection	Unverified
Customizing Grapheme-to-Phoneme System for Non-Trivial Transcription Problems in Bangla Language	Jun 1, 2019	speech-recognition, Speech Recognition	Unverified
Improve few-shot voice cloning using multi-modal learning	Mar 18, 2022	text-to-speech, Text to Speech	Unverified
Improving Accent Conversion with Reference Encoder and End-To-End Text-To-Speech	May 19, 2020	text-to-speech, Text to Speech	Unverified
AudioJailbreak: Jailbreak Attacks against End-to-End Large Audio-Language Models	May 20, 2025	text-to-speech, Text to Speech	Unverified
Improving Audio Codec-based Zero-Shot Text-to-Speech Synthesis with Multi-Modal Context and Large Language Model	Jun 6, 2024	Language Modeling, Language Modelling	Unverified
Improving Code-Switching and Named Entity Recognition in ASR with Speech Editing based Data Augmentation	Jun 14, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Improving Contextual Recognition of Rare Words with an Alternate Spelling Prediction Model	Sep 2, 2022	text-to-speech, Text to Speech	Unverified
An Algorithm Based on Empirical Methods, for Text-to-Tuneful-Speech Synthesis of Sanskrit Verse	Sep 15, 2014	Speech Synthesis, text-to-speech	Unverified
Improving Deliberation by Text-Only and Semi-Supervised Training	Jun 29, 2022	Decoder, Language Modeling	Unverified
HMM-based data augmentation for E2E systems for building conversational speech synthesis systems	Dec 22, 2022	Data Augmentation, Language Modeling	Unverified
Improving Grapheme-to-Phoneme Conversion through In-Context Knowledge Retrieval with Large Language Models	Nov 12, 2024	Grapheme-to-Phoneme Conversion, Retrieval	Unverified
CUIfy the XR: An Open-Source Package to Embed LLM-powered Conversational Agents in XR	Nov 7, 2024	Language Modelling, Large Language Model	Unverified
Improving Low Resource Code-switched ASR using Augmented Code-switched TTS	Oct 12, 2020	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Improving LPCNet-based Text-to-Speech with Linear Prediction-structured Mixture Density Network	Jan 31, 2020	Quantization, Speech Synthesis	Unverified
Improving Mandarin Prosodic Structure Prediction with Multi-level Contextual Information	Aug 31, 2023	Decoder, Multi-Task Learning	Unverified
Improving multi-speaker TTS prosody variance with a residual encoder and normalizing flows	Jun 10, 2021	Disentanglement, Sentence	Unverified
Improving Noise Robustness of LLM-based Zero-shot TTS via Discrete Acoustic Token Denoising	May 20, 2025	Decoder, Denoising	Unverified
Improving Performance of End-to-End ASR on Numeric Sequences	Jul 1, 2019	speech-recognition, Speech Recognition	Unverified
Improving prosodic phrasing of Vietnamese text-to-speech systems	Dec 1, 2020	text-to-speech, Text to Speech	Unverified
Improving Prosody Modelling with Cross-Utterance BERT Embeddings for End-to-end Speech Synthesis	Nov 6, 2020	Decoder, Sentence	Unverified
Improving Readability for Automatic Speech Recognition Transcription	Apr 9, 2020	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
HLTCOE JHU Submission to the Voice Privacy Challenge 2024	Sep 13, 2024	text-to-speech, Text to Speech	Unverified
Improving Robustness of LLM-based Speech Synthesis by Learning Monotonic Alignment	Jun 25, 2024	Decoder, Language Modeling	Unverified
Improving Speech-to-Speech Translation Through Unlabeled Text	Oct 26, 2022	Machine Translation, speech-recognition	Unverified
Improving the expressiveness of neural vocoding with non-affine Normalizing Flows	Jun 16, 2021	text-to-speech, Text to Speech	Unverified
Improving the quality of neural TTS using long-form content and multi-speaker multi-style modeling	Dec 20, 2022	Form, text-to-speech	Unverified
Cued Speech Generation Leveraging a Pre-trained Audiovisual Text-to-Speech Model	Jan 8, 2025	text-to-speech, Text to Speech	Unverified
Incorporating speaker embedding and post-filter network for improving speaker similarity of personalized speech synthesis system	Oct 1, 2021	Speaker Verification, Speech Synthesis	Unverified
Incremental Disentanglement for Environment-Aware Zero-Shot Text-to-Speech Synthesis	Dec 22, 2024	Decoder, Disentanglement	Unverified
A Survey on Audio Diffusion Models: Text To Speech Synthesis and Enhancement in Generative AI	Mar 23, 2023	Speech Enhancement, Speech Synthesis	Unverified
Incremental Machine Speech Chain Towards Enabling Listening while Speaking in Real-time	Nov 4, 2020	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
High Quality Streaming Speech Synthesis with Low, Sentence-Length-Independent Latency	Nov 17, 2021	CPU, Decoder	Unverified
High-Quality Automatic Voice Over with Accurate Alignment: Supervision through Self-Supervised Discrete Speech Units	Jun 29, 2023	Speech Synthesis, text-to-speech	Unverified

