SOTAVerified

Text to Speech

from gtts import gTTS
import os

def text_to_speech_kurdish(text, output_file="output.mp3"):
    # Convert text to speech in Kurdish ("ku" is chosen as the language code).
    # Note: gTTS may not support "ku"; check gtts.lang.tts_langs() before relying on it.
    tts = gTTS(text=text, lang='ku', slow=False)
    tts.save(output_file)
    # Open the audio file (Windows only: "start" is a cmd.exe builtin).
    os.system(f"start {output_file}")

# Example (the text means "Hello, this is my voice in Kurdish."):
text_to_speech_kurdish("سڵاو، ئەمە دەنگی منە بە زمانی کوردی.")
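The snippet above opens the saved file with `os.system(f"start {output_file}")`, which only works on Windows. A minimal sketch of a more portable alternative follows. The helper names `build_open_command` and `open_audio` are hypothetical, and the Linux branch assumes a desktop where `xdg-open` is installed.

```python
import subprocess
import sys


def build_open_command(path, platform=sys.platform):
    """Return the argv list that opens `path` with the default application."""
    if platform.startswith("win"):
        # `start` is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        # macOS ships the `open` command.
        return ["open", path]
    # Assumption: a freedesktop-style Linux desktop with xdg-open available.
    return ["xdg-open", path]


def open_audio(path):
    """Open the audio file using an argv list instead of a shell string."""
    subprocess.run(build_open_command(path), check=False)
```

Calling `open_audio("output.mp3")` after `tts.save(...)` would replace the `os.system` line. Passing an argument list rather than an interpolated string also avoids problems with spaces in file names.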

Papers

Showing 951–1000 of 1419 papers

Title | Status | Hype
Environment Aware Text-to-Speech Synthesis | - | 0
A study on the efficacy of model pre-training in developing neural text-to-speech system | - | 0
Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech | Code | 1
VisualTTS: TTS with Accurate Lip-Speech Synchronization for Automatic Voice Over | - | 0
Applying Phonological Features in Multilingual Text-To-Speech | Code | 0
Mixer-TTS: non-autoregressive, fast and compact text-to-speech model conditioned on language model embeddings | Code | 1
Emphasis control for parallel neural TTS | - | 0
GANtron: Emotional Speech Synthesis with Generative Adversarial Networks | - | 0
Hierarchical prosody modeling and control in non-autoregressive parallel neural TTS | - | 0
EdiTTS: Score-based Editing for Controllable Text-to-Speech | Code | 1
Prosody-TTS: An end-to-end speech synthesis system with prosody control | - | 0
Style Equalization: Unsupervised Learning of Controllable Generative Sequence Models | - | 0
On the Interplay Between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis | - | 0
Neural Speech Synthesis in German | - | 0
Incorporating speaker embedding and post-filter network for improving speaker similarity of personalized speech synthesis system | - | 0
PortaSpeech: Portable and High-Quality Generative Text-to-Speech | Code | 2
Conditioning Sequence-to-sequence Networks with Learned Activations | - | 0
Guided-TTS: Text-to-Speech with Untranscribed Speech | - | 0
FlowVocoder: A small Footprint Neural Vocoder based Normalizing flow for Speech Synthesis | - | 0
A Proposal of Automatic Error Correction in Text | - | 0
Low-Latency Incremental Text-to-Speech Synthesis with Distilled Context Prediction Network | - | 0
On-device neural speech synthesis | - | 0
fairseq S^2: A Scalable and Integrable Speech Synthesis Toolkit | Code | 0
Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration | Code | 1
Referee: Towards reference-free cross-speaker style transfer with low-quality data for expressive speech synthesis | - | 0
Integrated Speech and Gesture Synthesis | Code | 0
A Unified Transformer-based Framework for Duplex Text Normalization | - | 0
Fighting Game Commentator with Pitch and Loudness Adjustment Utilizing Highlight Cues | - | 0
GC-TTS: Few-shot Speaker Adaptation with Geometric Constraints | - | 0
Enhancing audio quality for expressive Neural Text-to-Speech | - | 0
RW-Resnet: A Novel Speech Anti-Spoofing Model Using Raw Waveform | - | 0
AnyoneNet: Synchronized Speech and Talking Head Generation for Arbitrary Person | - | 0
A Speech-enabled Fixed-phrase Translator for Healthcare Accessibility | - | 0
BTS: Back TranScription for Speech-to-Text Post-Processor using Text-to-Speech-to-Text | - | 0
A Survey on Audio Synthesis and Audio-Visual Multimodal Processing | - | 0
Cross-speaker Style Transfer with Prosody Bottleneck in Neural Speech Synthesis | - | 0
UR Channel-Robust Synthetic Speech Detection System for ASVspoof 2021 | Code | 1
Adaptation of Tacotron2-based Text-To-Speech for Articulatory-to-Acoustic Mapping using Ultrasound Tongue Imaging | Code | 0
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion | Code | 1
Digital Einstein Experience: Fast Text-to-Speech for Conversational AI | - | 0
On Prosody Modeling for ASR+TTS based Voice Conversion | - | 0
Extending Text-to-Speech Synthesis with Articulatory Movement Prediction using Ultrasound Tongue Imaging | Code | 0
Federated Learning with Dynamic Transformer for Text to Speech | - | 0
SoundStream: An End-to-End Neural Audio Codec | Code | 3
AdaSpeech 3: Adaptive Text to Speech for Spontaneous Style | - | 0
Location, Location: Enhancing the Evaluation of Text-to-Speech Synthesis Using the Rapid Prosody Transcription Paradigm | - | 0
Speech Synthesis from Text and Ultrasound Tongue Image-based Articulatory Input | Code | 0
EditSpeech: A Text Based Speech Editing System Using Partial Inference and Bidirectional Fusion | Code | 1
Hierarchical Context-Aware Transformers for Non-Autoregressive Text to Speech | - | 0
Multi-Scale Spectrogram Modelling for Neural Text-to-Speech | - | 0
Page 20 of 29

No leaderboard results yet.