Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert the text to speech in Kurdish (language code "ku" selects Kurdish).
    # Note: gTTS may reject "ku" if it is not in its supported-language list.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (Windows only)

# Example (the Kurdish input reads: "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
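
The snippet above opens the generated file with `start`, which only exists on Windows. A minimal sketch of a portable alternative follows, using only the Python standard library. The helper names `build_open_command` and `open_audio` are hypothetical and are not part of gTTS, and the fallback branch assumes a Linux-like desktop with `xdg-open` installed.

```python
import subprocess
import sys


def build_open_command(path, platform=None):
    """Return the argument list that opens `path` with the default application.

    Hypothetical helper for illustration; not part of gTTS.
    """
    platform = platform or sys.platform
    if platform.startswith("win"):
        # `start` is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    # Assumption: any other platform is a Linux-like desktop with xdg-utils.
    return ["xdg-open", path]


def open_audio(path):
    """Launch the default player for `path` (hypothetical helper)."""
    subprocess.run(build_open_command(path), check=False)
```

Calling `open_audio(output_file)` in place of the `os.system` line would keep the rest of the function unchanged. Passing an argument list rather than a formatted shell string also avoids problems with file names that contain spaces.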

Papers

Showing 426–450 of 1419 papers

Title	Date	Tasks	Status
Aligner-Guided Training Paradigm: Advancing Text-to-Speech Models with Aligner Guided Duration	Dec 11, 2024	text-to-speech, Text to Speech	Unverified
EmoSpeech: A Corpus of Emotionally Rich and Contextually Detailed Speech Annotations	Dec 9, 2024	text-to-speech, Text to Speech	Unverified
DiffStyleTTS: Diffusion-based Hierarchical Prosody Modeling for Text-to-Speech with Diverse and Controllable Styles	Dec 4, 2024	Prosody Prediction, text-to-speech	Unverified
Text Is Not All You Need: Multimodal Prompting Helps LLMs Understand Humor	Dec 1, 2024	All, Natural Language Understanding	Unverified
SALMONN-omni: A Codec-free LLM for Full-duplex Speech Understanding and Generation	Nov 27, 2024	Question Answering, Speech Enhancement	Unverified
Continual Learning in Machine Speech Chain Using Gradient Episodic Memory	Nov 27, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Visatronic: A Multimodal Decoder-Only Model for Speech Synthesis	Nov 26, 2024	Decoder, multimodal generation	Unverified
Hard-Synth: Synthesizing Diverse Hard Samples for ASR using Zero-Shot TTS and LLM	Nov 20, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
A Context-Based Numerical Format Prediction for a Text-To-Speech System	Nov 19, 2024	text-to-speech, Text to Speech	Unverified
Leveraging Virtual Reality and AI Tutoring for Language Learning: A Case Study of a Virtual Campus Environment with OpenAI GPT Integration with Unity 3D	Nov 19, 2024	Speech-to-Text, text-to-speech	Unverified
Rethinking MUSHRA: Addressing Modern Challenges in Text-to-Speech Evaluation	Nov 19, 2024	text-to-speech, Text to Speech	Unverified
Improving Grapheme-to-Phoneme Conversion through In-Context Knowledge Retrieval with Large Language Models	Nov 12, 2024	Grapheme-to-Phoneme Conversion, Retrieval	Unverified
Debatts: Zero-Shot Debating Text-to-Speech Synthesis	Nov 10, 2024	Speech Synthesis, text-to-speech	Unverified
CUIfy the XR: An Open-Source Package to Embed LLM-powered Conversational Agents in XR	Nov 7, 2024	Language Modelling, Large Language Model	Unverified
Speech is More Than Words: Do Speech-to-Text Translation Systems Leverage Prosody?	Oct 31, 2024	Rhythm, speech-recognition	Unverified
Robust and Unbounded Length Generalization in Autoregressive Transformer-Based Text-to-Speech	Oct 29, 2024	Decoder, text-to-speech	Code Available
Fast and High-Quality Auto-Regressive Speech Synthesis via Speculative Decoding	Oct 29, 2024	Speech Synthesis, text-to-speech	Unverified
RDSinger: Reference-based Diffusion Network for Singing Voice Synthesis	Oct 29, 2024	Denoising, Singing Voice Synthesis	Unverified
Asynchronous Tool Usage for Real-Time Agents	Oct 28, 2024	Automatic Speech Recognition, speech-recognition	Unverified
Get Large Language Models Ready to Speak: A Late-fusion Approach for Speech Generation	Oct 27, 2024	parameter-efficient fine-tuning, Question Answering	Unverified
Making Social Platforms Accessible: Emotion-Aware Speech Generation with Integrated Text Analysis	Oct 24, 2024	Speech Synthesis, text-to-speech	Unverified
Evaluating and Improving Automatic Speech Recognition Systems for Korean Meteorological Experts	Oct 24, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
ELAICHI: Enhancing Low-resource TTS by Addressing Infrequent and Low-frequency Character Bigrams	Oct 23, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Enhancing Low-Resource ASR through Versatile TTS: Bridging the Data Gap	Oct 22, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
Continuous Speech Tokenizer in Text To Speech	Oct 22, 2024	Language Modeling, Language Modelling	Code Available

No leaderboard results yet.