SOTAVerified

Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" is selected for Kurdish).
    # Check that your gTTS version lists "ku" (see gtts.lang.tts_langs()) before relying on it.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (on Windows)

# Example (the string means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
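The snippet above opens the saved file with `start`, which only exists on Windows. A minimal sketch of a portable alternative follows, using only the standard library. The helpers `build_open_command` and `open_audio` are hypothetical names introduced here for illustration and are not part of gTTS. The fallback branch assumes a Linux or BSD desktop with `xdg-open` installed.

```python
import platform
import subprocess


def build_open_command(path, system=None):
    """Return the argv list that opens `path` with the default application.

    Hypothetical helper for illustration; `system` defaults to the
    value reported by platform.system().
    """
    system = system or platform.system()
    if system == "Windows":
        # `start` is a cmd.exe built-in; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if system == "Darwin":
        # macOS ships the `open` command.
        return ["open", path]
    # Assumption: any other platform is a desktop with xdg-utils installed.
    return ["xdg-open", path]


def open_audio(path):
    """Launch the platform's default player for `path` (hypothetical helper)."""
    subprocess.run(build_open_command(path), check=False)
```

Passing the command as a list avoids shell quoting problems when the output file name contains spaces, which the `os.system` f-string form does not handle.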

Papers

Showing 1101-1150 of 1419 papers

Title | Status | Hype
Word-wise intonation model for cross-language TTS systems | | 0
You Do Not Need More Data: Improving End-To-End Speech Recognition by Text-To-Speech Data Augmentation | | 0
Your voice is your voice: Supporting Self-expression through Speech Generation and LLMs in Augmented and Alternative Communication | | 0
Zero-shot Cross-lingual Voice Transfer for TTS | | 0
Zero-Shot Long-Form Voice Cloning with Dynamic Convolution Attention | | 0
Zero-Shot Streaming Text to Speech Synthesis with Transducer and Auto-Regressive Modeling | | 0
Zero-Shot Text-to-Speech as Golden Speech Generator: A Systematic Framework and its Applicability in Automatic Pronunciation Assessment | | 0
Zero Shot Text to Speech Augmentation for Automatic Speech Recognition on Low-Resource Accented Speech Corpora | | 0
Zero-Shot Text-to-Speech for Vietnamese | | 0
Zero-shot text-to-speech synthesis conditioned using self-supervised speech representation model | | 0
Zero-Shot vs. Few-Shot Multi-Speaker TTS Using Pre-trained Czech SpeechT5 Model | | 0
ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis with Diffusion and Style-based Models | | 0
Zipper: A Multi-Tower Decoder Architecture for Fusing Modalities | | 0
Pruning Self-Attention for Zero-Shot Multi-Speaker Text-to-Speech | | 0
Pseudo-Autoregressive Neural Codec Language Models for Efficient Zero-Shot Text-to-Speech Synthesis | | 0
Punjabi Text-To-Speech Synthesis System | | 0
運用Python結合語音辨識及合成技術於自動化音文同步之實作 (A Python Implementation of Automatic Speech-text Synchronization Using Speech Recognition and Text-to-Speech Technology) [In Chinese] | | 0
QI-TTS: Questioning Intonation Control for Emotional Speech Synthesis | | 0
RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis | | 0
Rapid Speaker Adaptation in Low Resource Text to Speech Systems using Synthetic Data and Transfer learning | | 0
RASMALAI: Resources for Adaptive Speech Modeling in Indian Languages with Accents and Intonations | | 0
RDSinger: Reference-based Diffusion Network for Singing Voice Synthesis | | 0
Reading Assistance through LARA, the Learning And Reading Assistant | | 0
Real-Time Pill Identification for the Visually Impaired Using Deep Learning | | 0
ReCAB-VAE: Gumbel-Softmax Variational Inference Based on Analytic Divergence | | 0
Referee: Towards reference-free cross-speaker style transfer with low-quality data for expressive speech synthesis | | 0
Refer-iTTS: A System for Referring in Spoken Installments to Objects in Real-World Images | | 0
Regotron: Regularizing the Tacotron2 architecture via monotonic alignment loss | | 0
Reinforce-Aligner: Reinforcement Alignment Search for Robust End-to-End Text-to-Speech | | 0
Reinforcement Learning for Emotional Text-to-Speech Synthesis with Improved Emotion Discriminability | | 0
DLPO: Diffusion Model Loss-Guided Reinforcement Learning for Fine-Tuning Text-to-Speech Diffusion Models | | 0
Rep2wav: Noise Robust text-to-speech Using self-supervised representations | | 0
Replacing Human Audio with Synthetic Audio for On-device Unspoken Punctuation Prediction | | 0
Representation Selective Self-distillation and wav2vec 2.0 Feature Exploration for Spoof-aware Speaker Verification | | 0
中文轉客文文轉音系統中的客語斷詞處理之研究 (Research on Hakka Word Segmentation Processes in Chinese-to-Hakka Text-to-Speech System) [In Chinese] | | 0
Residual Adapters for Few-Shot Text-to-Speech Speaker Adaptation | | 0
Resource-Efficient Fine-Tuning Strategies for Automatic MOS Prediction in Text-to-Speech for Low-Resource Languages | | 0
Rethinking MUSHRA: Addressing Modern Challenges in Text-to-Speech Evaluation | | 0
Retrieval-Augmented Audio Deepfake Detection | | 0
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement | | 0
ReVISE: Self-Supervised Speech Resynthesis With Visual Input for Universal and Generalized Speech Regeneration | | 0
Revisiting IPA-based Cross-lingual Text-to-speech | | 0
Revisiting Over-Smoothness in Text to Speech | | 0
Revival with Voice: Multi-modal Controllable Text-to-Speech Synthesis | | 0
r-G2P: Evaluating and Enhancing Robustness of Grapheme to Phoneme Conversion by Controlled noise introducing and Contextual information incorporation | | 0
Rhythm-controllable Attention with High Robustness for Long Sentence Speech Synthesis | | 0
R-MelNet: Reduced Mel-Spectral Modeling for Neural TTS | | 0
Robust Zero-Shot Text-to-Speech Synthesis with Reverse Inference Optimization | | 0
RSS-TOBI - A Prosodically Enhanced Romanian Speech Corpus | | 0
RUSLAN: Russian Spoken Language Corpus for Speech Synthesis | | 0
Page 23 of 29

No leaderboard results yet.