Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selects Kurdish).
    # Note: "ku" may not be in gTTS's supported-language list; if it is not,
    # gTTS raises ValueError here.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (Windows only)

# Example (the sentence means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
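The snippet above opens the saved file with the Windows-only `start` command. A minimal sketch of a cross-platform alternative follows, using only the standard library. The helper names `build_open_command` and `open_file` are hypothetical, and the Linux branch assumes `xdg-open` (from xdg-utils) is installed.

```python
import subprocess
import sys


def build_open_command(path, platform=sys.platform):
    """Return the argument list that opens `path` in the default application."""
    if platform.startswith("win"):
        # `start` is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    # Assumption: a desktop Linux/BSD system with xdg-utils installed.
    return ["xdg-open", path]


def open_file(path):
    """Launch the platform's default handler for `path` (hypothetical helper)."""
    subprocess.run(build_open_command(path), check=False)
```

Passing an argument list instead of an interpolated shell string also avoids problems when the output filename contains spaces.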

Papers

Showing 401–425 of 1419 papers

Title	Date	Tasks	Status
MARS6: A Small and Robust Hierarchical-Codec Text-to-Speech Model	Jan 10, 2025	Decoder, Language Modelling	Unverified
Probing Speaker-specific Features in Speaker Representations	Jan 9, 2025	Self-Supervised Learning, Speaker Verification	Unverified
Cued Speech Generation Leveraging a Pre-trained Audiovisual Text-to-Speech Model	Jan 8, 2025	text-to-speech, Text to Speech	Unverified
Disambiguation of Chinese Polyphones in an End-to-End Framework with Semantic Features Extracted by Pre-trained BERT	Jan 2, 2025	Polyphone disambiguation, Sentence	Unverified
FaceSpeak: Expressive and High-Quality Speech Synthesis from Human Portraits of Different Styles	Jan 2, 2025	Speech Synthesis, text-to-speech	Unverified
Stable-TTS: Stable Speaker-Adaptive Text-to-Speech Synthesis via Prosody Prompting	Dec 28, 2024	Speech Synthesis, text-to-speech	Unverified
Indonesian-English Code-Switching Speech Synthesizer Utilizing Multilingual STEN-TTS and Bert LID	Dec 26, 2024	Language Identification, text-to-speech	Unverified
"I've Heard of You!": Generate Spoken Named Entity Recognition Data for Unseen Entities	Dec 26, 2024	Domain Adaptation, Language Modeling	Code Available
Advancing NAM-to-Speech Conversion with Novel Methods and the MultiNAM Dataset	Dec 25, 2024	text-to-speech, Text to Speech	Unverified
Incremental Disentanglement for Environment-Aware Zero-Shot Text-to-Speech Synthesis	Dec 22, 2024	Decoder, Disentanglement	Unverified
Autoregressive Speech Synthesis with Next-Distribution Prediction	Dec 22, 2024	Language Modeling, Language Modelling	Unverified
Why Do Speech Language Models Fail to Generate Semantically Coherent Outputs? A Modality Evolving Perspective	Dec 22, 2024	text-to-speech, Text to Speech	Unverified
Interleaved Speech-Text Language Models are Simple Streaming Text to Speech Synthesizers	Dec 20, 2024	Language Modeling, Language Modelling	Unverified
Scale This, Not That: Investigating Key Dataset Attributes for Efficient Speech Enhancement Scaling	Dec 19, 2024	Attribute, Speech Enhancement	Unverified
Enhancing Naturalness in LLM-Generated Utterances through Disfluency Insertion	Dec 17, 2024	text-to-speech, Text to Speech	Unverified
Phoneme-Level Feature Discrepancies: A Key to Detecting Sophisticated Speech Deepfakes	Dec 17, 2024	DeepFake Detection, Face Swapping	Unverified
ProsodyFM: Unsupervised Phrasing and Intonation Control for Intelligible Speech Synthesis	Dec 16, 2024	Speech Synthesis, text-to-speech	Unverified
Multi-modal and Multi-scale Spatial Environment Understanding for Immersive Visual Text-to-Speech	Dec 16, 2024	text-to-speech, Text to Speech	Code Available
Efficient Generative Modeling with Residual Vector Quantization-Based Tokens	Dec 13, 2024	Conditional Image Generation, Image Generation	Unverified
AMuSeD: An Attentive Deep Neural Network for Multimodal Sarcasm Detection Incorporating Bi-modal Data Augmentation	Dec 13, 2024	Data Augmentation, Sarcasm Detection	Unverified
CSSinger: End-to-End Chunkwise Streaming Singing Voice Synthesis System Based on Conditional Variational Autoencoder	Dec 12, 2024	Audio Synthesis, Singing Voice Synthesis	Unverified
A Preliminary Analysis of Automatic Word and Syllable Prominence Detection in Non-Native Speech With Text-to-Speech Prosody Embeddings	Dec 11, 2024	text-to-speech, Text to Speech	Unverified
A Unified Model For Voice and Accent Conversion In Speech and Singing using Self-Supervised Learning and Feature Extraction	Dec 11, 2024	Decoder, Self-Supervised Learning	Unverified
LatentSpeech: Latent Diffusion for Text-To-Speech Generation	Dec 11, 2024	text-to-speech, Text to Speech	Unverified
Multimodal Latent Language Modeling with Next-Token Diffusion	Dec 11, 2024	Image Generation, Language Modeling	Code Available

No leaderboard results yet.