SOTAVerified

Text to Speech

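from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selects Kurdish).
    # Check that "ku" is listed by gtts.lang.tts_langs() for your gTTS version;
    # gTTS rejects language codes it does not support.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (Windows only)

# Example (the Kurdish string means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")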
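The snippet above opens the saved file with `start`, which exists only on Windows. A minimal sketch of a portable alternative follows, using only the Python standard library. The helper names `build_open_command` and `open_audio_file` are hypothetical, and the fallback to `xdg-open` assumes a Linux-like desktop where that tool is installed.

```python
import os
import platform
import subprocess


def build_open_command(path, system=None):
    """Return the argument list that opens `path` with the default application.

    `system` defaults to platform.system(); it can be passed explicitly,
    which makes the function easy to check without launching anything.
    """
    system = system or platform.system()
    if system == "Windows":
        # `start` is a cmd.exe builtin; the empty string is its window-title argument.
        return ["cmd", "/c", "start", "", path]
    if system == "Darwin":
        # macOS ships the `open` command.
        return ["open", path]
    # Assumption: any other system is a Linux-like desktop that provides xdg-open.
    return ["xdg-open", path]


def open_audio_file(path):
    """Open an existing audio file with the platform's default player."""
    if not os.path.exists(path):
        raise FileNotFoundError(path)
    # check=False: a missing or failing player should not crash the caller.
    subprocess.run(build_open_command(path), check=False)
```

Under these assumptions, calling `open_audio_file(output_file)` could take the place of the `os.system` line. Passing the path as a list element rather than interpolating it into a shell string also avoids problems with spaces in file names.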

Papers

Showing 1051-1100 of 1419 papers

| Title | Status | Hype |
| --- | --- | --- |
| A Unified Transformer-based Framework for Duplex Text Normalization | | 0 |
| Fighting Game Commentator with Pitch and Loudness Adjustment Utilizing Highlight Cues | | 0 |
| GC-TTS: Few-shot Speaker Adaptation with Geometric Constraints | | 0 |
| Enhancing audio quality for expressive Neural Text-to-Speech | | 0 |
| RW-Resnet: A Novel Speech Anti-Spoofing Model Using Raw Waveform | | 0 |
| AnyoneNet: Synchronized Speech and Talking Head Generation for Arbitrary Person | | 0 |
| BTS: Back TranScription for Speech-to-Text Post-Processor using Text-to-Speech-to-Text | | 0 |
| A Speech-enabled Fixed-phrase Translator for Healthcare Accessibility | | 0 |
| A Survey on Audio Synthesis and Audio-Visual Multimodal Processing | | 0 |
| Cross-speaker Style Transfer with Prosody Bottleneck in Neural Speech Synthesis | | 0 |
| Adaptation of Tacotron2-based Text-To-Speech for Articulatory-to-Acoustic Mapping using Ultrasound Tongue Imaging | Code | 0 |
| Digital Einstein Experience: Fast Text-to-Speech for Conversational AI | | 0 |
| On Prosody Modeling for ASR+TTS based Voice Conversion | | 0 |
| Extending Text-to-Speech Synthesis with Articulatory Movement Prediction using Ultrasound Tongue Imaging | Code | 0 |
| Federated Learning with Dynamic Transformer for Text to Speech | | 0 |
| Location, Location: Enhancing the Evaluation of Text-to-Speech Synthesis Using the Rapid Prosody Transcription Paradigm | | 0 |
| AdaSpeech 3: Adaptive Text to Speech for Spontaneous Style | | 0 |
| Speech Synthesis from Text and Ultrasound Tongue Image-based Articulatory Input | Code | 0 |
| GANSpeech: Adversarial Training for High-Fidelity Multi-Speaker Speech Synthesis | | 0 |
| Multi-Scale Spectrogram Modelling for Neural Text-to-Speech | | 0 |
| Hierarchical Context-Aware Transformers for Non-Autoregressive Text to Speech | | 0 |
| Non-Autoregressive TTS with Explicit Duration Modelling for Low-Resource Highly Expressive Speech | | 0 |
| Non-native English lexicon creation for bilingual speech synthesis | | 0 |
| Advances in Speech Vocoding for Text-to-Speech with Continuous Parameters | | 0 |
| EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model | | 0 |
| Improving the expressiveness of neural vocoding with non-affine Normalizing Flows | | 0 |
| ADEPT: A Dataset for Evaluating Prosody Transfer | | 0 |
| Ctrl-P: Temporal Control of Prosodic Variation for Speech Synthesis | | 0 |
| A learned conditional prior for the VAE acoustic space of a TTS system | | 0 |
| SynthASR: Unlocking Synthetic Data for Speech Recognition | | 0 |
| Improving multi-speaker TTS prosody variance with a residual encoder and normalizing flows | | 0 |
| Speech BERT Embedding For Improving Prosody in Neural TTS | | 0 |
| Data Augmentation Methods for End-to-end Speech Recognition on Distant-Talk Scenarios | | 0 |
| Reinforce-Aligner: Reinforcement Alignment Search for Robust End-to-End Text-to-Speech | | 0 |
| Speaker verification-derived loss and data augmentation for DNN-based multispeaker speech synthesis | | 0 |
| An objective evaluation of the effects of recording conditions and speaker characteristics in multi-speaker deep neural speech synthesis | | 0 |
| Dual Script E2E framework for Multilingual and Code-Switching ASR | | 0 |
| A Corpus of Neutral Voice Speech in Brazilian Portuguese | | 0 |
| Learning Robust Latent Representations for Controllable Speech Synthesis | | 0 |
| Talrómur: A large Icelandic TTS corpus | | 0 |
| On Addressing Practical Challenges for RNN-Transducer | | 0 |
| Phrase break prediction with bidirectional encoder representations in Japanese text-to-speech synthesis | Code | 0 |
| Non-autoregressive sequence-to-sequence voice conversion | | 0 |
| Enhancing Word-Level Semantic Representation via Dependency Structure for Expressive Text-to-Speech Synthesis | | 0 |
| Comparing the Benefit of Synthetic Training Data for Various Automatic Speech Recognition Architectures | | 0 |
| Exploring Machine Speech Chain for Domain Adaptation and Few-Shot Speaker Adaptation | | 0 |
| Flavored Tacotron: Conditional Learning for Prosodic-linguistic Features | | 0 |
| Grapheme-to-Phoneme Transformer Model for Transfer Learning Dialects | | 0 |
| AI4D -- African Language Program | Code | 0 |
| Reinforcement Learning for Emotional Text-to-Speech Synthesis with Improved Emotion Discriminability | | 0 |
Page 22 of 29

No leaderboard results yet.