Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selects Kurdish).
    # Check gtts.lang.tts_langs() to confirm "ku" is available in your gTTS version.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (on Windows)

# Example (the text means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
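The snippet above opens the saved audio with the Windows-only `start` command. As a minimal sketch of a more portable alternative, the following uses only the Python standard library to pick an opener per platform. The helper names `build_open_command` and `open_audio_file` are hypothetical, and the Linux branch assumes `xdg-open` is installed.

```python
import subprocess
import sys


def build_open_command(path, platform=sys.platform):
    """Return the argument list that opens `path` with the default application."""
    if platform.startswith("win"):
        # `start` is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    # Assumption: a desktop Linux/BSD system with xdg-utils installed.
    return ["xdg-open", path]


def open_audio_file(path):
    """Open the file using an argument list rather than a shell string."""
    subprocess.run(build_open_command(path), check=False)
```

Passing an argument list instead of an interpolated shell string also means file names containing spaces are not split apart. Calling `open_audio_file("output.mp3")` after `tts.save(...)` would take the place of the `os.system` line.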

Papers


Showing 501–550 of 1419 papers

Title	Date	Tasks	Status
An overview of text-to-speech systems and media applications	Oct 22, 2023	Acoustic Modelling, Text to Speech	Unverified
Evaluating Text-to-Speech Synthesis from a Large Discrete Token-based Speech Language Model	May 16, 2024	Hallucination, Language Modeling	Unverified
Efficient Generative Modeling with Residual Vector Quantization-Based Tokens	Dec 13, 2024	Conditional Image Generation, Image Generation	Unverified
Explicit Intensity Control for Accented Text-to-speech	Oct 27, 2022	Speech Recognition	Unverified
Efficient data selection employing Semantic Similarity-based Graph Structures for model training	Feb 22, 2024	Semantic Similarity, Semantic Textual Similarity	Unverified
Exploiting Transliterated Words for Finding Similarity in Inter-Language News Articles using Machine Learning	May 29, 2022	Articles, Machine Translation	Unverified
Exploring an Inter-Pausal Unit (IPU) based Approach for Indic End-to-End TTS Systems	Sep 18, 2024	Sentence, Text to Speech	Unverified
Exploring Machine Speech Chain for Domain Adaptation and Few-Shot Speaker Adaptation	Apr 8, 2021	Automatic Speech Recognition (ASR)	Unverified
Exploring Speech Enhancement for Low-resource Speech Synthesis	Sep 19, 2023	Automatic Speech Recognition (ASR)	Unverified
Exploring speech style spaces with language models: Emotional TTS without emotion labels	May 18, 2024	Text to Speech	Unverified
Boosting Diffusion Model for Spectrogram Up-sampling in Text-to-speech: An Empirical Study	Jun 7, 2024	Diversity, Language Modeling	Unverified
Effect of choice of probability distribution, randomness, and search methods for alignment modeling in sequence-to-sequence text-to-speech synthesis using hard alignment	Oct 28, 2019	Hard Attention, Speech Synthesis	Unverified
BOFFIN TTS: Few-Shot Speaker Adaptation by Bayesian Optimization	Feb 4, 2020	Bayesian Optimization, Text to Speech	Unverified
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era	Oct 6, 2022	Speech Synthesis, Text to Speech	Unverified
Adversarial Speaker-Consistency Learning Using Untranscribed Speech Data for Zero-Shot Multi-Speaker Text-to-Speech	Oct 12, 2022	Text to Speech	Unverified
Effectiveness of text to speech pseudo labels for forced alignment and cross lingual pretrained models for low resource speech recognition	Mar 31, 2022	Automatic Speech Recognition (ASR)	Unverified
BiVocoder: A Bidirectional Neural Vocoder Integrating Feature Extraction and Waveform Generation	Jun 4, 2024	Text to Speech	Unverified
Effective Decoder Masking for Transformer Based End-to-End Speech Recognition	Oct 27, 2020	Automatic Speech Recognition (ASR)	Unverified
BitTTS: Highly Compact Text-to-Speech Using 1.58-bit Quantization and Weight Indexing	Jun 4, 2025	Quantization, Text to Speech	Unverified
A Novel Data Augmentation Approach for Automatic Speaking Assessment on Opinion Expressions	Jun 4, 2025	Data Augmentation, Diversity	Unverified
Easy, Interpretable, Effective: openSMILE for voice deepfake detection	Aug 28, 2024	DeepFake Detection, Face Swapping	Unverified
E3 TTS: Easy End-to-End Diffusion-based Text to Speech	Nov 2, 2023	Text to Speech	Unverified
A Novel Chinese Dialect TTS Frontend with Non-Autoregressive Neural Machine Translation	Jun 10, 2022	Machine Translation, Text to Speech	Unverified
Adversarial Attacks and Robust Defenses in Speaker Embedding based Zero-Shot Text-to-Speech System	Oct 5, 2024	Adversarial Purification, Speech Synthesis	Unverified
Scheduled Interleaved Speech-Text Training for Speech-to-Speech Translation with LLMs	Jun 12, 2025	Speech-to-Speech Translation, Text to Speech	Unverified
E1 TTS: Simple and Fast Non-Autoregressive TTS	Sep 14, 2024	Denoising, Text to Speech	Unverified
Dynamic Prosody Generation for Speech Synthesis using Linguistics-Driven Acoustic Embedding Selection	Dec 2, 2019	Speech Synthesis, Text to Speech	Unverified
DurIAN-E: Duration Informed Attention Network For Expressive Text-to-Speech Synthesis	Sep 22, 2023	Denoising, Speech Synthesis	Unverified
Beyond Text-to-Text: An Overview of Multimodal and Generative Artificial Intelligence for Education Using Topic Modeling	Sep 24, 2024	Articles, Text to Speech	Unverified
A Novel Approach to OCR using Image Recognition based Classification for Ancient Tamil Inscriptions in Temples	Jul 4, 2019	Binarization, General Classification	Unverified
DurIAN-E 2: Duration Informed Attention Network with Adaptive Variational Autoencoder and Adversarial Learning for Expressive Text-to-Speech Synthesis	Oct 17, 2024	Speech Synthesis, Text to Speech	Unverified
Duration-aware pause insertion using pre-trained language model for multi-speaker text-to-speech	Feb 27, 2023	Language Modeling	Unverified
DubWise: Video-Guided Speech Duration Control in Multimodal LLM-based Text-to-Speech for Dubbing	Jun 13, 2024	Language Modeling	Unverified
Advancing NAM-to-Speech Conversion with Novel Methods and the MultiNAM Dataset	Dec 25, 2024	Text to Speech	Unverified
Dual Supervised Learning	Jul 3, 2017	General Classification, Image Classification	Unverified
DualSpeech: Enhancing Speaker-Fidelity and Text-Intelligibility Through Dual Classifier-Free Guidance	Aug 26, 2024	Diversity, Text to Speech	Unverified
BERT, can HE predict contrastive focus? Predicting and controlling prominence in neural TTS using a language model	Jul 4, 2022	Language Modeling	Unverified
Dual Script E2E framework for Multilingual and Code-Switching ASR	Jun 2, 2021	Automatic Speech Recognition (ASR)	Unverified
Dual Audio-Centric Modality Coupling for Talking Head Generation	Mar 26, 2025	NeRF, Talking Head Generation	Unverified
Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy	Oct 13, 2022	Generative Adversarial Network, Speaker Anonymization	Unverified
DTW-SiameseNet: Dynamic Time Warped Siamese Network for Mispronunciation Detection and Correction	Mar 1, 2023	Dynamic Time Warping, Metric Learning	Unverified
DSE-TTS: Dual Speaker Embedding for Cross-Lingual Text-to-Speech	Jun 25, 2023	Speech Synthesis, Text to Speech	Unverified
Benchmarking Expressive Japanese Character Text-to-Speech with VITS and Style-BERT-VITS2	May 22, 2025	Benchmarking, Dialogue Generation	Unverified
DPP-TTS: Diversifying prosodic features of speech via determinantal point processes	Oct 23, 2023	Diversity, Point Processes	Unverified
LAraBench: Benchmarking Arabic AI with Large Language Models	May 24, 2023	Benchmarking, Few-Shot Learning	Unverified
An objective evaluation of the effects of recording conditions and speaker characteristics in multi-speaker deep neural speech synthesis	Jun 3, 2021	Speaker Verification, Speech Synthesis	Unverified
Empowering Communication: Speech Technology for Indian and Western Accents through AI-powered Speech Synthesis	Jan 22, 2024	Speaker Verification, Speech Synthesis	Unverified
DPI-TTS: Directional Patch Interaction for Fast-Converging and Style Temporal Modeling in Text-to-Speech	Sep 18, 2024	Text to Speech	Unverified
Do Prosody Transfer Models Transfer Prosody?	Mar 7, 2023	Speech Synthesis, Text to Speech	Unverified
Does Audio Deepfake Detection Generalize?	Mar 30, 2022	Audio Deepfake Detection, DeepFake Detection	Unverified

