Text to Speech

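from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selects Kurdish).
    # Note: "ku" may not be among the languages gTTS supports;
    # gtts.lang.tts_langs() returns the supported codes.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (on Windows)

# Example (the Kurdish string means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")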
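The snippet above opens the saved file with `start`, which exists only in the Windows command shell. A minimal sketch of a portable alternative follows, assuming `open` is available on macOS and `xdg-open` (from xdg-utils) on Linux desktops. The helper names `open_command` and `play_file` are hypothetical and not part of the original snippet or of gTTS.

```python
import subprocess
import sys


def open_command(path, platform=sys.platform):
    """Return the argument list that opens `path` with the default app.

    Hypothetical helper: `platform` defaults to the running system and
    can be overridden to inspect the command for another OS.
    """
    if platform.startswith("win"):
        # `start` is a cmd.exe builtin; the empty string fills its
        # window-title argument so the path is not mistaken for a title.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    # Assumption: a Linux/BSD desktop with xdg-utils installed.
    return ["xdg-open", path]


def play_file(path):
    """Open the audio file in the system's default player."""
    # check=False: a missing opener should not raise inside this sketch.
    subprocess.run(open_command(path), check=False)
```

Calling `play_file("output.mp3")` in place of the `os.system(...)` line would be one way to keep the function usable outside Windows. Passing the path as a list element also avoids shell quoting problems with file names that contain spaces.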

Papers

Showing 1351–1400 of 1419 papers

Title	Date	Tasks	Status
RNN Approaches to Text Normalization: A Challenge	Oct 31, 2016	Text Normalization, text-to-speech	Code Available
Phrase break prediction with bidirectional encoder representations in Japanese text-to-speech synthesis	Apr 26, 2021	Language Modeling, Language Modelling	Code Available
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale	Jun 23, 2023	In-Context Learning, Speech Synthesis	Code Available
A Comparative Study on Transformer vs RNN in Speech Applications	Sep 13, 2019	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available
Spoofing Speaker Verification Systems with Deep Multi-speaker Text-to-speech Synthesis	Oct 29, 2019	Speaker Verification, Speech Synthesis	Code Available
Non-Autoregressive Neural Text-to-Speech	May 21, 2019	text-to-speech, Text to Speech	Code Available
ObamaNet: Photo-realistic lip-sync from text	Dec 6, 2017	Constrained Lip-synchronization, text-to-speech	Code Available
AlignTTS: Efficient Feed-Forward Text-to-Speech System without Explicit Alignment	Mar 4, 2020	text-to-speech, Text to Speech	Code Available
Numbers Normalisation in the Inflected Languages: a Case Study of Polish	Aug 1, 2019	text-to-speech, Text to Speech	Code Available
ASSERT: Anti-Spoofing with Squeeze-Excitation and Residual neTworks	Apr 1, 2019	Feature Engineering, text-to-speech	Code Available
BanglaFake: Constructing and Evaluating a Specialized Bengali Deepfake Audio Dataset	May 16, 2025	DeepFake Detection, Face Swapping	Code Available
Neural Voice Puppetry: Audio-driven Facial Reenactment	Dec 11, 2019	Face Model, Neural Rendering	Code Available
Integrated Speech and Gesture Synthesis	Aug 25, 2021	Speech Synthesis, text-to-speech	Code Available
Text-to-Video: a Two-stage Framework for Zero-shot Identity-agnostic Talking-head Generation	Aug 12, 2023	Talking Head Generation, text-to-speech	Code Available
Independent and automatic evaluation of acoustic-to-articulatory inversion models	Nov 15, 2019	speech-recognition, Speech Recognition	Code Available
Statistical Parametric Speech Synthesis Incorporating Generative Adversarial Networks	Sep 23, 2017	Speech Synthesis, text-to-speech	Code Available
Naturalization of Text by the Insertion of Pauses and Filler Words	Nov 7, 2020	Sentence, text-to-speech	Code Available
Humane Speech Synthesis through Zero-Shot Emotion and Disfluency Generation	Mar 31, 2024	Language Modeling, Language Modelling	Code Available
Deep Voice 2: Multi-Speaker Neural Text-to-Speech	May 24, 2017	Speech Synthesis, text-to-speech	Code Available
SaSLaW: Dialogue Speech Corpus with Audio-visual Egocentric Information Toward Environment-adaptive Dialogue Speech Synthesis	Aug 13, 2024	Speech Synthesis, Spoken Dialogue Systems	Code Available
Robust and Unbounded Length Generalization in Autoregressive Transformer-Based Text-to-Speech	Oct 29, 2024	Decoder, text-to-speech	Code Available
CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages	Mar 27, 2019	text-to-speech, Text to Speech	Code Available
AraSpot: Arabic Spoken Command Spotting	Mar 29, 2023	Data Augmentation, Keyword Spotting	Code Available
Multi-Source Spatial Knowledge Understanding for Immersive Visual Text-to-Speech	Oct 18, 2024	object-detection, Object Detection	Code Available
Multimodal Latent Language Modeling with Next-Token Diffusion	Dec 11, 2024	Image Generation, Language Modeling	Code Available
Cross-Modal Generalization: Learning in Low Resource Modalities via Meta-Alignment	Dec 4, 2020	Meta-Learning, text-to-speech	Code Available
Multi-modal and Multi-scale Spatial Environment Understanding for Immersive Visual Text-to-Speech	Dec 16, 2024	text-to-speech, Text to Speech	Code Available
Continuous Speech Tokenizer in Text To Speech	Oct 22, 2024	Language Modeling, Language Modelling	Code Available
MLS: A Large-Scale Multilingual Dataset for Speech Research	Dec 7, 2020	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available
Mlphon: A Multifunctional Grapheme-Phoneme Conversion Tool Using Finite State Transducers	Sep 5, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis	Jun 12, 2018	Speaker Verification, Speech Synthesis	Code Available
High Fidelity Speech Synthesis with Adversarial Networks	Sep 25, 2019	Generative Adversarial Network, Speech Synthesis	Code Available
A wearable sensor vest for social humanoid robots with GPGPU, IoT, and modular software architecture	Jan 6, 2022	Speech-to-Text, text-to-speech	Code Available
Emilia: An Extensive, Multilingual, and Diverse Speech Dataset for Large-Scale Speech Generation	Jul 7, 2024	Text to Speech	Code Available
Adaptation of Tacotron2-based Text-To-Speech for Articulatory-to-Acoustic Mapping using Ultrasound Tongue Imaging	Jul 26, 2021	text-to-speech, Text to Speech	Code Available
VIFS: An End-to-End Variational Inference for Foley Sound Synthesis	Jun 8, 2023	Speech Synthesis, text-to-speech	Code Available
Semantic Mask for Transformer based End-to-End Speech Recognition	Dec 6, 2019	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available
Effective parameter estimation methods for an ExcitNet model in generative text-to-speech systems	May 21, 2019	parameter estimation, Speech Synthesis	Code Available
The Sequence-to-Sequence Baseline for the Voice Conversion Challenge 2020: Cascading ASR and TTS	Oct 6, 2020	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available
Hierarchical Generative Modeling for Controllable Speech Synthesis	Oct 16, 2018	Attribute, Speech Synthesis	Code Available
MelNet: A Generative Model for Audio in the Frequency Domain	Jun 4, 2019	Audio Generation, Music Generation	Code Available
Using generative modelling to produce varied intonation for speech synthesis	Jun 10, 2019	Sentence, Speech Synthesis	Code Available
Applying Phonological Features in Multilingual Text-To-Speech	Oct 7, 2021	Language Acquisition, text-to-speech	Code Available
Massively Multilingual Neural Grapheme-to-Phoneme Conversion	Aug 4, 2017	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available
MaSS: A Large and Clean Multilingual Corpus of Sentence-aligned Spoken Utterances Extracted from the Bible	Jul 30, 2019	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available
StyleSpeech: Parameter-efficient Fine Tuning for Pre-trained Controllable Text-to-Speech	Aug 27, 2024	parameter-efficient fine-tuning, text-to-speech	Code Available
Sequence Transduction with Recurrent Neural Networks	Nov 14, 2012	Machine Translation, Phoneme Recognition	Code Available
Audio Super Resolution using Neural Networks	Aug 2, 2017	Audio Generation, Audio Super-Resolution	Code Available
Generating Synthetic Speech from SpokenVocab for Speech Translation	Oct 15, 2022	Data Augmentation, Machine Translation	Code Available
Generating Data with Text-to-Speech and Large-Language Models for Conversational Speech Recognition	Aug 17, 2024	Language Modeling, Language Modelling	Code Available

No leaderboard results yet.