SOTAVerified

Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish (language code "ku" selects Kurdish).
    # Note: check gtts.lang.tts_langs() to confirm that "ku" is supported by your gTTS version.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    os.system(f"start {output_file}")  # Open the audio file (on Windows)

# Example (the string means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
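The snippet above opens the saved file with `start`, which only exists on Windows. A portable variant might pick the opener by platform instead. The sketch below is one possible approach, not part of the original snippet: the helper names `build_open_command` and `open_file` are hypothetical, and it assumes `open` is available on macOS and `xdg-open` on other systems.

```python
import platform
import subprocess


def build_open_command(path, system=None):
    """Return the argument list that opens `path` with the default application."""
    system = system or platform.system()
    if system == "Windows":
        # `start` is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if system == "Darwin":
        return ["open", path]
    # Assumption: other systems provide xdg-open (typical on Linux desktops).
    return ["xdg-open", path]


def open_file(path):
    """Launch the default player without building a shell command string."""
    subprocess.run(build_open_command(path), check=False)
```

Passing the path as a separate list element avoids the quoting problems that an f-string shell command has with file names containing spaces.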

Papers

Showing 1201–1250 of 1419 papers

Title | Status | Hype
GraphTTS: graph-to-sequence modelling in neural text-to-speech | - | 0
Comparison of Speech Representations for Automatic Quality Estimation in Multi-Speaker Text-to-Speech Synthesis | Code | 0
Semi-Supervised Neural Architecture Search | Code | 1
On the Discrepancy between Density Estimation and Sequence Generation | - | 0
Fully-hierarchical fine-grained prosody modeling for interpretable speech synthesis | - | 0
Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior | - | 0
BOFFIN TTS: Few-Shot Speaker Adaptation by Bayesian Optimization | - | 0
WaveTTS: Tacotron-based TTS with Joint Time-Frequency Domain Loss | - | 0
Improving LPCNet-based Text-to-Speech with Linear Prediction-structured Mixture Density Network | - | 0
From Speech-to-Speech Translation to Automatic Dubbing | - | 0
Smart Summarizer for Blind People | - | 0
Parallel Neural Text-to-Speech | - | 0
Generating Synthetic Audio Data for Attention-Based Speech Recognition Systems | - | 0
Voice Transformer Network: Sequence-to-Sequence Voice Conversion Using Transformer with Text-to-Speech Pretraining | Code | 1
Singing Synthesis: with a little help from my attention | - | 0
Neural Voice Puppetry: Audio-driven Facial Reenactment | Code | 0
Semantic Mask for Transformer based End-to-End Speech Recognition | Code | 0
Towards Robust Neural Vocoding for Speech Generation: A Survey | - | 0
Dynamic Prosody Generation for Speech Synthesis using Linguistics-Driven Acoustic Embedding Selection | - | 0
Using VAEs and Normalizing Flows for One-shot Text-To-Speech Synthesis of Expressive Speech | - | 0
Cross-lingual Multi-speaker Text-to-speech Synthesis for Voice Cloning without Using Parallel Corpus for Unseen Speakers | - | 0
Prosody Transfer in Neural Text to Speech Using Global Pitch and Loudness Features | - | 0
Independent and automatic evaluation of acoustic-to-articulatory inversion models | Code | 0
Emotional Voice Conversion using Multitask Learning with Text-to-speech | Code | 0
A unified sequence-to-sequence front-end model for Mandarin text-to-speech synthesis | - | 0
Teacher-Student Training for Robust Tacotron-based TTS | - | 0
Incremental Text-to-Speech Synthesis with Prefix-to-Prefix Framework | - | 0
A System for Diacritizing Four Varieties of Arabic | - | 0
Spoofing Speaker Verification Systems with Deep Multi-speaker Text-to-speech Synthesis | Code | 0
Unsupervised pre-training for sequence to sequence speech recognition | - | 0
Effect of choice of probability distribution, randomness, and search methods for alignment modeling in sequence-to-sequence text-to-speech synthesis using hard alignment | - | 0
Multi-Reference Neural TTS Stylization with Adversarial Cycle Consistency | - | 0
Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram | Code | 2
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit | Code | 0
Location-Relative Attention Mechanisms For Robust Long-Form Speech Synthesis | Code | 0
G2G: TTS-Driven Pronunciation Learning for Graphemic Hybrid ASR | - | 0
The Theory behind Controllable Expressive Speech Synthesis: a Cross-disciplinary Approach | - | 0
Semi-Supervised Generative Modeling for Controllable Speech Synthesis | - | 0
High Fidelity Speech Synthesis with Adversarial Networks | Code | 0
Bootstrapping non-parallel voice conversion from speaker-adaptive text-to-speech | - | 0
A Comparative Study on Transformer vs RNN in Speech Applications | Code | 0
Modular Meta-Learning with Shrinkage | - | 0
Evaluating Long-form Text-to-Speech: Comparing the Ratings of Sentences and Paragraphs | - | 0
Neural Network-Based Modeling of Phonetic Durations | - | 0
A Large-Scale User Study of an Alexa Prize Chatbot: Effect of TTS Dynamism on Perceived Quality of Social Dialog | - | 0
Initial investigation of an encoder-decoder end-to-end TTS framework using marginalization of monotonic hard latent alignments | - | 0
Neural Harmonic-plus-Noise Waveform Model with Trainable Maximum Voice Frequency for Text-to-Speech Synthesis | - | 0
From Text to Sound: A Preliminary Study on Retrieving Sound Effects to Radio Stories | - | 0
Numbers Normalisation in the Inflected Languages: a Case Study of Polish | Code | 0
MaSS: A Large and Clean Multilingual Corpus of Sentence-aligned Spoken Utterances Extracted from the Bible | Code | 0
Page 25 of 29

No leaderboard results yet.