Text to Speech

import gTTS import os def text_to_speech_kurdish(text, output_file="output.mp3"): # گۆڕینی نووسین بۆ دەنگ بە زمانی کوردی (هەڵبژاردنی زمانی "ku" بۆ کوردی) tts = gTTS(text=text, lang='ku', slow=False) tts.save(output_file) os.system(f"start {output_file}") # کردنەوەی فایلە دەنگییەکە (لە Windows) # نموونە: text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 101–125 of 1419 papers

Title	Date	Tasks	Status	Hype
Empowering Global Voices: A Data-Efficient, Phoneme-Tone Adaptive Approach to High-Fidelity Speech Synthesis	Apr 10, 2025	Speech Synthesistext-to-speech	—Unverified	0
SpeakEasy: Enhancing Text-to-Speech Interactions for Expressive Content Creation	Apr 7, 2025	text-to-speechText to Speech	—Unverified	0
RWKVTTS: Yet another TTS based on RWKV-7	Apr 4, 2025	Computational Efficiencytext-to-speech	CodeCode Available	2
TeleAntiFraud-28k: An Audio-Text Slow-Thinking Dataset for Telecom Fraud Detection	Mar 31, 2025	Fraud DetectionLarge Language Model	CodeCode Available	2
Speculative End-Turn Detector for Efficient Speech Chatbot Assistant	Mar 30, 2025	ChatbotCollaborative Inference	—Unverified	0
SupertonicTTS: Towards Highly Scalable and Efficient Text-to-Speech System	Mar 29, 2025	Speech Synthesistext-to-speech	—Unverified	0
DeepAudio-V1:Towards Multi-Modal Multi-Stage End-to-End Video to Speech and Audio Generation	Mar 28, 2025	Audio GenerationAudio-Visual Synchronization	—Unverified	0
Dual Audio-Centric Modality Coupling for Talking Head Generation	Mar 26, 2025	NeRFTalking Head Generation	—Unverified	0
Your voice is your voice: Supporting Self-expression through Speech Generation and LLMs in Augmented and Alternative Communication	Mar 21, 2025	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	—Unverified	0
MoonCast: High-Quality Zero-Shot Podcast Generation	Mar 18, 2025	Speech Synthesistext-to-speech	CodeCode Available	3
MAVFlow: Preserving Paralinguistic Elements with Conditional Flow Matching for Zero-Shot AV2AV Multilingual Translation	Mar 14, 2025	text-to-speechText to Speech	—Unverified	0
An Exhaustive Evaluation of TTS- and VC-based Data Augmentation for ASR	Mar 11, 2025	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	—Unverified	0
VocalEyes: Enhancing Environmental Perception for the Visually Impaired through Vision-Language Models and Distance-Aware Object Detection	Mar 10, 2025	NVIDIA Jetson Orin Nanoobject-detection	—Unverified	0
Scaling Rich Style-Prompted Text-to-Speech Datasets	Mar 6, 2025	Language ModelingLanguage Modelling	CodeCode Available	2
InSerter: Speech Instruction Following with Unsupervised Interleaved Pre-training	Mar 4, 2025	Instruction Followingtext-to-speech	—Unverified	0
Direct Speech to Speech Translation: A Review	Mar 3, 2025	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	—Unverified	0
Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens	Mar 3, 2025	Attributetext-to-speech	CodeCode Available	11
UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation	Mar 2, 2025	DecoderRepresentation Learning	—Unverified	0
Telephone Surveys Meet Conversational AI: Evaluating a LLM-Based Telephone Survey System at Scale	Feb 27, 2025	AI AgentLarge Language Model	—Unverified	0
Nexus: An Omni-Perceptive And -Interactive Model for Language, Audio, And Vision	Feb 26, 2025	Audio SynthesisAutomatic Speech Recognition	—Unverified	0
MegaTTS 3: Sparse Alignment Enhanced Latent Diffusion Transformer for Zero-Shot Speech Synthesis	Feb 26, 2025	Speech Synthesistext-to-speech	—Unverified	0
Clip-TTS: Contrastive Text-content and Mel-spectrogram, A High-Quality Text-to-Speech Method based on Contextual Semantic Understanding	Feb 26, 2025	text-to-speechText to Speech	—Unverified	0
Balancing Speech Understanding and Generation Using Continual Pre-training for Codec-based Speech LLM	Feb 24, 2025	Automatic Speech RecognitionLanguage Modeling	—Unverified	0
NaturalL2S: End-to-End High-quality Multispeaker Lip-to-Speech Synthesis with Differential Digital Signal Processing	Feb 17, 2025	Lip to Speech Synthesisspeech-recognition	—Unverified	0
SyncSpeech: Low-Latency and Efficient Dual-Stream Text-to-Speech based on Temporal Masked Transformer	Feb 16, 2025	text-to-speechText to Speech	—Unverified	0

Show:10 25 50

← PrevPage 5 of 57Next →

No leaderboard results yet.