Text to Speech

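from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" is chosen for Kurdish).
    # Note: gTTS validates `lang` against its supported list by default;
    # check gtts.lang.tts_langs() to confirm that "ku" is available.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (Windows only)

# Example (the string means "Hello, this is my voice in the Kurdish language."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")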
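The snippet above opens the saved file with the Windows-only `start` command. As a minimal sketch of how that step might be made portable, the helper below picks an opener per platform and avoids passing a shell string. The names `build_open_command` and `open_audio_file` are hypothetical (they are not part of gTTS), and the Linux branch assumes that `xdg-open` is installed.

```python
import subprocess
import sys


def build_open_command(path, platform=sys.platform):
    """Return the argv list that asks the OS to open `path` in its default app."""
    if platform.startswith("win"):
        # `start` is a cmd.exe builtin; the empty "" fills its window-title argument.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    # Assumption: a desktop Linux/BSD system with xdg-utils installed.
    return ["xdg-open", path]


def open_audio_file(path):
    """Launch the default player for `path` without building a shell string."""
    subprocess.run(build_open_command(path), check=False)
```

Under these assumptions, `open_audio_file(output_file)` could stand in for the `os.system` call. Passing an argument list also means a file name containing spaces is not split by the shell.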

Papers


Showing 301–350 of 1419 papers

Title	Date	Tasks	Status	Score
Betray Oneself: A Novel Audio DeepFake Detection Model via Mono-to-Stereo Conversion	May 25, 2023	Audio Deepfake Detection, DeepFake Detection	Code Available	5
An Open Source Web Reader for Under-Resourced Languages	Jun 1, 2022	text-to-speech, Text to Speech	Code Available	5
Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy	Oct 13, 2022	Generative Adversarial Network, Speaker anonymization	Code Available	5
ObamaNet: Photo-realistic lip-sync from text	Dec 6, 2017	Constrained Lip-synchronization, text-to-speech	Code Available	5
SpikeVoice: High-Quality Text-to-Speech Via Efficient Spiking Neural Network	Jul 17, 2024	text-to-speech, Text to Speech	Code Available	5
Numbers Normalisation in the Inflected Languages: a Case Study of Polish	Aug 1, 2019	text-to-speech, Text to Speech	Code Available	5
Bayesian Parameter-Efficient Fine-Tuning for Overcoming Catastrophic Forgetting	Feb 19, 2024	Language Modeling, Language Modelling	Code Available	5
An investigation of phrase break prediction in an End-to-End TTS system	Apr 9, 2023	Prediction, text-to-speech	Code Available	5
BanglaFake: Constructing and Evaluating a Specialized Bengali Deepfake Audio Dataset	May 16, 2025	DeepFake Detection, Face Swapping	Code Available	5
Neural Voice Puppetry: Audio-driven Facial Reenactment	Dec 11, 2019	Face Model, Neural Rendering	Code Available	5
A wearable sensor vest for social humanoid robots with GPGPU, IoT, and modular software architecture	Jan 6, 2022	Speech-to-Text, text-to-speech	Code Available	5
Multimodal Latent Language Modeling with Next-Token Diffusion	Dec 11, 2024	Image Generation, Language Modeling	Code Available	5
Multi-modal and Multi-scale Spatial Environment Understanding for Immersive Visual Text-to-Speech	Dec 16, 2024	text-to-speech, Text to Speech	Code Available	5
Multi-Source Spatial Knowledge Understanding for Immersive Visual Text-to-Speech	Oct 18, 2024	object-detection, Object Detection	Code Available	5
MLS: A Large-Scale Multilingual Dataset for Speech Research	Dec 7, 2020	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	5
Meta Learning Text-to-Speech Synthesis in over 7000 Languages	Jun 10, 2024	Meta-Learning, Speech Synthesis	Code Available	5
MelNet: A Generative Model for Audio in the Frequency Domain	Jun 4, 2019	Audio Generation, Music Generation	Code Available	5
Mlphon: A Multifunctional Grapheme-Phoneme Conversion Tool Using Finite State Transducers	Sep 5, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	5
Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning	Oct 20, 2017	GPU, Speech Synthesis	Code Available	5
Deep Voice 2: Multi-Speaker Neural Text-to-Speech	May 24, 2017	Speech Synthesis, text-to-speech	Code Available	5
Luganda Text-to-Speech Machine	May 11, 2020	text-to-speech, Text to Speech	Code Available	5
Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis	Oct 23, 2019	Form, Speech Synthesis	Code Available	5
Low-Resource Multilingual and Zero-Shot Multispeaker TTS	Oct 21, 2022	Meta-Learning, text-to-speech	Code Available	5
LibriS2S: A German-English Speech-to-Speech Translation Corpus	Apr 22, 2022	Speech-to-Speech Translation, Speech-to-Text	Code Available	5
Let's Give a Voice to Conversational Agents in Virtual Reality	Aug 4, 2023	Speech-to-Text, text-to-speech	Code Available	5
Learning Speaker Embedding from Text-to-Speech	Oct 21, 2020	Classification, Decoder	Code Available	5
Audio Super Resolution using Neural Networks	Aug 2, 2017	Audio Generation, Audio Super-Resolution	Code Available	5
Learning High-Frequency Functions Made Easy with Sinusoidal Positional Encoding	Jul 12, 2024	regression, text-to-speech	Code Available	5
MaSS: A Large and Clean Multilingual Corpus of Sentence-aligned Spoken Utterances Extracted from the Bible	Jul 30, 2019	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	5
Language-Agnostic Meta-Learning for Low-Resource Text-to-Speech with Articulatory Features	Mar 7, 2022	Meta-Learning, text-to-speech	Code Available	5
JSSS: free Japanese speech corpus for summarization and simplification	Oct 5, 2020	Form, Speech Synthesis	Code Available	5
Latent Optimal Paths by Gumbel Propagation for Variational Bayesian Dynamic Programming	Jun 5, 2023	Bayesian Inference, Singing Voice Synthesis	Code Available	5
"I've Heard of You!": Generate Spoken Named Entity Recognition Data for Unseen Entities	Dec 26, 2024	Domain Adaptation, Language Modeling	Code Available	5
IsoChronoMeter: A simple and effective isochronic translation evaluation metric	Oct 14, 2024	Machine Translation, text-to-speech	Code Available	5
Integrated Speech and Gesture Synthesis	Aug 25, 2021	Speech Synthesis, text-to-speech	Code Available	5
Investigating on Incorporating Pretrained and Learnable Speaker Representations for Multi-Speaker Multi-Style Text-to-Speech	Mar 6, 2021	text-to-speech, Text to Speech	Code Available	5
Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language	Oct 29, 2018	Speech Synthesis, text-to-speech	Code Available	5
CSS10: A Collection of Single Speaker Speech Datasets for 10 Languages	Mar 27, 2019	text-to-speech, Text to Speech	Code Available	5
Independent and automatic evaluation of acoustic-to-articulatory inversion models	Nov 15, 2019	speech-recognition, Speech Recognition	Code Available	5
Improving LPCNet-based Text-to-Speech with Linear Prediction-structured Mixture Density Network	Jan 31, 2020	Quantization, Speech Synthesis	Code Available	5
Massively Multilingual Neural Grapheme-to-Phoneme Conversion	Aug 4, 2017	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	5
Naturalization of Text by the Insertion of Pauses and Filler Words	Nov 7, 2020	Sentence, text-to-speech	Code Available	5
High Fidelity Speech Synthesis with Adversarial Networks	Sep 25, 2019	Generative Adversarial Network, Speech Synthesis	Code Available	5
Humane Speech Synthesis through Zero-Shot Emotion and Disfluency Generation	Mar 31, 2024	Language Modeling, Language Modelling	Code Available	5
Hierarchical Generative Modeling for Controllable Speech Synthesis	Oct 16, 2018	Attribute, Speech Synthesis	Code Available	5
Hierarchical Prosody Modeling for Non-Autoregressive Speech Synthesis	Nov 12, 2020	Speech Synthesis, text-to-speech	Code Available	5
Cross-Modal Generalization: Learning in Low Resource Modalities via Meta-Alignment	Dec 4, 2020	Meta-Learning, text-to-speech	Code Available	5
Attentive Multi-Layer Perceptron for Non-autoregressive Generation	Oct 14, 2023	Machine Translation, Speech Synthesis	Code Available	5
Generating Data with Text-to-Speech and Large-Language Models for Conversational Speech Recognition	Aug 17, 2024	Language Modeling, Language Modelling	Code Available	5
Generating Synthetic Audio Data for Attention-Based Speech Recognition Systems	Dec 19, 2019	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	5

