SOTAVerified

Text to Speech

from gtts import gTTS
import os


def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku").
    # gTTS raises ValueError for language codes it does not support,
    # so check that "ku" is available in your installed version.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (Windows only)


# Example (the string means "Hello, this is my voice in Kurdish."):
# text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
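The snippet above opens the saved file with the Windows `start` command, so that step does nothing useful on macOS or Linux. A minimal sketch of a portable alternative follows. The helper names `open_command` and `open_audio` are hypothetical and not part of the original snippet, and the Linux branch assumes `xdg-open` (from xdg-utils) is installed.

```python
import subprocess
import sys


def open_command(path, platform=sys.platform):
    """Return the argument list that opens `path` with the default application."""
    if platform.startswith("win"):
        # `start` is a cmd.exe builtin; the empty string is its window-title argument.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        # macOS ships the `open` command.
        return ["open", path]
    # Assumption: a desktop Linux/BSD system with xdg-utils available.
    return ["xdg-open", path]


def open_audio(path):
    """Open the audio file using an argument list instead of a shell string."""
    subprocess.run(open_command(path), check=False)
```

Passing an argument list rather than an f-string to a shell also avoids problems with file names that contain spaces. Inside `text_to_speech_kurdish`, the `os.system(...)` line could then be swapped for `open_audio(output_file)`.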

Papers

Showing 851–900 of 1419 papers

| Title | Status | Hype |
| --- | --- | --- |
| A Taxonomy of Specific Problem Classes in Text-to-Speech Synthesis: Comparing Commercial and Open Source Performance | | 0 |
| A Text Normalisation System for Non-Standard English Words | | 0 |
| A Text-to-Speech Pipeline, Evaluation Methodology, and Initial Fine-Tuning Results for Child Speech Synthesis | | 0 |
| A Text to Speech (TTS) System with English to Punjabi Conversion | | 0 |
| A Transfer Learning End-to-End ArabicText-To-Speech (TTS) Deep Architecture | | 0 |
| Attempt Towards Stress Transfer in Speech-to-Speech Machine Translation | | 0 |
| Attention-Constrained Inference for Robust Decoder-Only Text-to-Speech | | 0 |
| AttentionStitch: How Attention Solves the Speech Editing Problem | | 0 |
| AttS2S-VC: Sequence-to-Sequence Voice Conversion with Attention and Context Preservation Mechanisms | | 0 |
| Audiobook Dialogues as Training Data for Conversational Style Synthetic Voices | | 0 |
| Audio-conditioned phonemic and prosodic annotation for building text-to-speech models from unlabeled speech data | | 0 |
| Audio Deep Fake Detection System with Neural Stitching for ADD 2022 | | 0 |
| A Survey on Audio Diffusion Models: Text To Speech Synthesis and Enhancement in Generative AI | | 0 |
| AudioJailbreak: Jailbreak Attacks against End-to-End Large Audio-Language Models | | 0 |
| AudioVisual Speech Synthesis: A brief literature review | | 0 |
| Augmentation through Laundering Attacks for Audio Spoof Detection | | 0 |
| Augmenting Images for ASR and TTS through Single-loop and Dual-loop Multimodal Chain Framework | | 0 |
| Augmenting text for spoken language understanding with Large Language Models | | 0 |
| A Unified Framework for Collecting Text-to-Speech Synthesis Datasets for 22 Indian Languages | | 0 |
| A unified front-end framework for English text-to-speech synthesis | | 0 |
| A Unified Model For Voice and Accent Conversion In Speech and Singing using Self-Supervised Learning and Feature Extraction | | 0 |
| A unified sequence-to-sequence front-end model for Mandarin text-to-speech synthesis | | 0 |
| A Unified Transformer-based Framework for Duplex Text Normalization | | 0 |
| Automatic Arabic Dialect Identification Systems for Written Texts: A Survey | | 0 |
| Automatic Evaluation of Speaker Similarity | | 0 |
| Automatic Evaluation of Turn-taking Cues in Conversational Speech Synthesis | | 0 |
| Automatic Heteronym Resolution Pipeline Using RAD-TTS Aligners | | 0 |
| Automatic Speech Recognition for Hindi | | 0 |
| AutoMOS: Learning a non-intrusive assessor of naturalness-of-speech | | 0 |
| Autoregressive Diffusion Transformer for Text-to-Speech Synthesis | | 0 |
| Autoregressive Speech Synthesis with Next-Distribution Prediction | | 0 |
| Autoregressive Speech Synthesis without Vector Quantization | | 0 |
| Auto Spell Suggestion for High Quality Speech Synthesis in Hindi | | 0 |
| AutoStyle-TTS: Retrieval-Augmented Generation based Automatic Style Matching Text-to-Speech Synthesis | | 0 |
| A Virtual Simulation-Pilot Agent for Training of Air Traffic Controllers | | 0 |
| Back-Translation-Style Data Augmentation for Mandarin Chinese Polyphone Disambiguation | | 0 |
| Bahasa Harmony: A Comprehensive Dataset for Bahasa Text-to-Speech Synthesis with Discrete Codec Modeling of EnGen-TTS | | 0 |
| Balancing Speech Understanding and Generation Using Continual Pre-training for Codec-based Speech LLM | | 0 |
| BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data | | 0 |
| LAraBench: Benchmarking Arabic AI with Large Language Models | | 0 |
| Benchmarking Expressive Japanese Character Text-to-Speech with VITS and Style-BERT-VITS2 | | 0 |
| BERT, can HE predict contrastive focus? Predicting and controlling prominence in neural TTS using a language model | | 0 |
| Beyond Text-to-Text: An Overview of Multimodal and Generative Artificial Intelligence for Education Using Topic Modeling | | 0 |
| BitTTS: Highly Compact Text-to-Speech Using 1.58-bit Quantization and Weight Indexing | | 0 |
| BiVocoder: A Bidirectional Neural Vocoder Integrating Feature Extraction and Waveform Generation | | 0 |
| BOFFIN TTS: Few-Shot Speaker Adaptation by Bayesian Optimization | | 0 |
| Boosting Diffusion Model for Spectrogram Up-sampling in Text-to-speech: An Empirical Study | | 0 |
| Boosting Large Language Model for Speech Synthesis: An Empirical Study | | 0 |
| Bootstrap an end-to-end ASR system by multilingual training, transfer learning, text-to-text mapping and synthetic audio | | 0 |
| Bootstrapping non-parallel voice conversion from speaker-adaptive text-to-speech | | 0 |
Page 18 of 29

No leaderboard results yet.