Text to Speech

import gTTS import os def text_to_speech_kurdish(text, output_file="output.mp3"): # گۆڕینی نووسین بۆ دەنگ بە زمانی کوردی (هەڵبژاردنی زمانی "ku" بۆ کوردی) tts = gTTS(text=text, lang='ku', slow=False) tts.save(output_file) os.system(f"start {output_file}") # کردنەوەی فایلە دەنگییەکە (لە Windows) # نموونە: text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 101–150 of 1419 papers

Title	Date	Tasks	Status	Hype	Score
PresentAgent: Multimodal Agent for Presentation Video Generation	Jul 5, 2025	text-to-speechText to Speech	CodeCode Available	2	5
Nix-TTS: Lightweight and End-to-End Text-to-Speech via Module-wise Distillation	Mar 29, 2022	CPUDecoder	CodeCode Available	2	5
Accelerating Diffusion-based Text-to-Speech Model Training with Dual Modality Alignment	May 26, 2025	text-to-speechText to Speech	CodeCode Available	2	5
LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT	Oct 7, 2023	Audio captioningAutomatic Speech Recognition	CodeCode Available	2	5
NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers	Apr 18, 2023	In-Context LearningSpeech Synthesis	CodeCode Available	2	5
RWKVTTS: Yet another TTS based on RWKV-7	Apr 4, 2025	Computational Efficiencytext-to-speech	CodeCode Available	2	5
SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models	Aug 31, 2023	DecoderLanguage Modeling	CodeCode Available	2	5
Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech	Nov 7, 2021	Meta-LearningSpeech Synthesis	CodeCode Available	1	5
g2pM: A Neural Grapheme-to-Phoneme Conversion Package for Mandarin Chinese Based on a New Open Benchmark Dataset	Apr 7, 2020	Grapheme-to-Phoneme ConversionPolyphone disambiguation	CodeCode Available	1	5
Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations	Mar 3, 2023	Speech DenoisingSpeech Enhancement	CodeCode Available	1	5
MathReader : Text-to-Speech for Mathematical Documents	Jan 13, 2025	Optical Character Recognition (OCR)text-to-speech	CodeCode Available	1	5
From Tens of Hours to Tens of Thousands: Scaling Back-Translation for Speech Recognition	May 22, 2025	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	CodeCode Available	1	5
FMFCC-A: A Challenging Mandarin Dataset for Synthetic Speech Detection	Oct 18, 2021	Speech SynthesisSynthetic Speech Detection	CodeCode Available	1	5
Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation	Jun 6, 2021	text-to-speechText to Speech	CodeCode Available	1	5
FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis	Oct 27, 2022	Speech Synthesistext-to-speech	CodeCode Available	1	5
Fine-grained style control in Transformer-based Text-to-speech Synthesis	Oct 12, 2021	Inductive BiasSpeech Synthesis	CodeCode Available	1	5
Making More of Little Data: Improving Low-Resource Automatic Speech Recognition Using Data Augmentation	May 18, 2023	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	CodeCode Available	1	5
FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis	Jun 29, 2021	Speech Synthesistext-to-speech	CodeCode Available	1	5
FastPitch: Parallel Text-to-speech with Pitch Prediction	Jun 11, 2020	Predictiontext-to-speech	CodeCode Available	1	5
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech	Jun 8, 2020	Knowledge DistillationSpeech Synthesis	CodeCode Available	1	5
From Speaker Verification to Multispeaker Speech Synthesis, Deep Transfer with Feedback Constraint	May 10, 2020	Speaker VerificationSpeech Synthesis	CodeCode Available	1	5
ALIF: Low-Cost Adversarial Audio Attacks on Black-Box Speech Platforms using Linguistic Features	Aug 3, 2024	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	CodeCode Available	1	5
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis	May 12, 2020	Speech SynthesisStyle Transfer	CodeCode Available	1	5
Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation	Aug 3, 2023	DecoderQuantization	CodeCode Available	1	5
Mitigating Unauthorized Speech Synthesis for Voice Protection	Oct 28, 2024	Data AugmentationFace Swapping	CodeCode Available	1	5
Evaluating Speech Synthesis by Training Recognizers on Synthetic Speech	Oct 1, 2023	speech-recognitionSpeech Recognition	CodeCode Available	1	5
Evaluating Parameter-Efficient Transfer Learning Approaches on SURE Benchmark for Speech Understanding	Mar 2, 2023	Speech Synthesistext-to-speech	CodeCode Available	1	5
ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet	Nov 29, 2021	Spoken Language Understandingtext-to-speech	CodeCode Available	1	5
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search	Feb 8, 2021	CPUModel Compression	CodeCode Available	1	5
Limited Data Emotional Voice Conversion Leveraging Text-to-Speech: Two-stage Sequence-to-Sequence Training	Mar 31, 2021	text-to-speechText to Speech	CodeCode Available	1	5
End-to-end Lyrics Alignment for Polyphonic Music Using an Audio-to-Character Recognition Model	Feb 18, 2019	Retrievaltext-to-speech	CodeCode Available	1	5
End to End Lip Synchronization with a Temporal AutoEncoder	Mar 30, 2022	text-to-speechText to Speech	CodeCode Available	1	5
Enhancing Speech Intelligibility in Text-To-Speech Synthesis using Speaking Style Conversion	Aug 13, 2020	Speech Synthesistext-to-speech	CodeCode Available	1	5
A Character-level Span-based Model for Mandarin Prosodic Structure Prediction	Mar 31, 2022	Sentencetext-to-speech	CodeCode Available	1	5
End-to-End Adversarial Text-to-Speech	Jun 5, 2020	Adversarial TextDynamic Time Warping	CodeCode Available	1	5
Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech	Sep 21, 2023	text-to-speechText to Speech	CodeCode Available	1	5
Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining	Jan 30, 2023	Language ModelingLanguage Modelling	CodeCode Available	1	5
Learning to Dub Movies via Hierarchical Prosody Models	Dec 8, 2022	text-to-speechText to Speech	CodeCode Available	1	5
ShiftySpeech: A Large-Scale Synthetic Speech Dataset with Distribution Shifts	Feb 8, 2025	BenchmarkingSelf-Supervised Learning	CodeCode Available	1	5
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search	May 22, 2020	text-to-speechText to Speech	CodeCode Available	1	5
EMNS /Imz/ Corpus: An emotive single-speaker dataset for narrative storytelling in games, television and graphic novels	May 22, 2023	Expressive Speech SynthesisSpeech Synthesis	CodeCode Available	1	5
Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on Data-Driven Deep Learning	Jun 15, 2022	AttributeEmotion Classification	CodeCode Available	1	5
EmoSpeech: Guiding FastSpeech2 Towards Emotional Text to Speech	Jun 28, 2023	Emotion RecognitionSpeech Synthesis	CodeCode Available	1	5
Learning Arousal-Valence Representation from Categorical Emotion Labels of Speech	Nov 24, 2023	Dimensionality ReductionEmotion Classification	CodeCode Available	1	5
LlamaPartialSpoof: An LLM-Driven Fake Speech Dataset Simulating Disinformation Generation	Sep 23, 2024	Language ModelingLanguage Modelling	CodeCode Available	1	5
Mixer-TTS: non-autoregressive, fast and compact text-to-speech model conditioned on language model embeddings	Oct 7, 2021	Language ModelingLanguage Modelling	CodeCode Available	1	5
Effective Deep Learning Models for Automatic Diacritization of Arabic Text	Nov 1, 2020	Arabic Text DiacritizationDecoder	CodeCode Available	1	5
EdiTTS: Score-based Editing for Controllable Text-to-Speech	Oct 6, 2021	Speech SynthesisSpeech-to-Text	CodeCode Available	1	5
EditSpeech: A Text Based Speech Editing System Using Partial Inference and Bidirectional Fusion	Jul 4, 2021	text-to-speechText to Speech	CodeCode Available	1	5
Accent Estimation of Japanese Words from Their Surfaces and Romanizations for Building Large Vocabulary Accent Dictionaries	Sep 21, 2020	Sentencetext-to-speech	CodeCode Available	1	5

Show:10 25 50

← PrevPage 3 of 29Next →

No leaderboard results yet.