UMETTS: A Unified Framework for Emotional Text-to-Speech Synthesis with Multimodal Prompts Apr 29, 2024 Contrastive Learning Speech Synthesis
Semi-supervised URL Segmentation with Recurrent Neural Networks Pre-trained on Knowledge Graph Entities Dec 1, 2020 Chinese Word Segmentation Speech Synthesis
MnTTS2: An Open-Source Multi-Speaker Mongolian Text-to-Speech Synthesis Dataset Dec 11, 2022 Speech Synthesis text-to-speech
Audio Jailbreak: An Open Comprehensive Benchmark for Jailbreaking Large Audio-Language Models May 21, 2025 Bayesian Optimization Speech Synthesis
Effective Deep Learning Models for Automatic Diacritization of Arabic Text Nov 1, 2020 Arabic Text Diacritization Decoder
EdiTTS: Score-based Editing for Controllable Text-to-Speech Oct 6, 2021 Speech Synthesis Speech-to-Text
ArTST: Arabic Text and Speech Transformer Oct 25, 2023 Automatic Speech Recognition Automatic Speech Recognition (ASR)
MnTTS: An Open-Source Mongolian Text-to-Speech Synthesis Dataset and Accompanied Baseline Sep 22, 2022 Speech Synthesis text-to-speech
Learning Arousal-Valence Representation from Categorical Emotion Labels of Speech Nov 24, 2023 Dimensionality Reduction Emotion Classification
KazakhTTS: An Open-Source Kazakh Text-to-Speech Synthesis Dataset Apr 17, 2021 Speech Synthesis text-to-speech
WaveGrad: Estimating Gradients for Waveform Generation Sep 2, 2020 Speech Synthesis Text-To-Speech Synthesis
Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation Aug 3, 2023 Decoder Quantization
Imaginary Voice: Face-styled Diffusion Model for Text-to-Speech Feb 27, 2023 Speech Synthesis text-to-speech
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis Jun 17, 2021 Speech Synthesis text-to-speech
Wave-Tacotron: Spectrogram-free end-to-end text-to-speech synthesis Nov 6, 2020 Decoder Speech Synthesis
Automatic Prosody Annotation with Pre-Trained Text-Speech Model Jun 16, 2022 Speech Synthesis text-to-speech
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis May 12, 2020 Speech Synthesis Style Transfer
Fine-grained style control in Transformer-based Text-to-speech Synthesis Oct 12, 2021 Inductive Bias Speech Synthesis
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech Jun 8, 2020 Knowledge Distillation Speech Synthesis
Exploring Transfer Learning for Low Resource Emotional TTS Jan 14, 2019 Deep Learning Emotional Speech Synthesis
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech May 13, 2021 Decoder Speech Synthesis
Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder Nov 7, 2022 Speech Synthesis text-to-speech
In Other News: A Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data Apr 4, 2019 Speech Synthesis text-to-speech
Improved Child Text-to-Speech Synthesis through Fastpitch-based Transfer Learning Nov 7, 2023 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search May 22, 2020 text-to-speech Text to Speech
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone Dec 4, 2021 Speech Synthesis Text-To-Speech Synthesis
Towards Lifelong Learning of Multilingual Text-To-Speech Synthesis Oct 9, 2021 Lifelong learning Speech Synthesis
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis Jun 12, 2018 Speaker Verification Speech Synthesis
The Emotional Voices Database: Towards Controlling the Emotion Dimension in Voice Generation Systems Jun 25, 2018 Speech Emotion Recognition Speech Synthesis
Tools and resources for Romanian text-to-speech and speech-to-text applications Feb 15, 2018 speech-recognition Speech Recognition
Systematic Inequalities in Language Technology Performance across the World's Languages Oct 13, 2021 Dependency Parsing Machine Translation
Comparison of Speech Representations for Automatic Quality Estimation in Multi-Speaker Text-to-Speech Synthesis Feb 28, 2020 Speech Synthesis text-to-speech
Systematic Inequalities in Language Technology Performance across the World’s Languages May 1, 2022 Dependency Parsing Machine Translation
Speech Synthesis from Text and Ultrasound Tongue Image-based Articulatory Input Jul 5, 2021 Speech Synthesis text-to-speech
PriorGrad: Improving Conditional Denoising Diffusion Models with Data-Dependent Adaptive Prior Jun 11, 2021 Audio Generation Denoising
Preparing an Endangered Language for the Digital Age: The Case of Judeo-Spanish May 31, 2022 Machine Translation Speech Synthesis
Attentive Multi-Layer Perceptron for Non-autoregressive Generation Oct 14, 2023 Machine Translation Speech Synthesis
Spoofing Speaker Verification Systems with Deep Multi-speaker Text-to-speech Synthesis Oct 29, 2019 Speaker Verification Speech Synthesis
Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale Jun 23, 2023 In-Context Learning Speech Synthesis
Multimodal Latent Language Modeling with Next-Token Diffusion Dec 11, 2024 Image Generation Language Modeling
Bayesian Parameter-Efficient Fine-Tuning for Overcoming Catastrophic Forgetting Feb 19, 2024 Language Modeling Language Modelling
Non-Autoregressive Neural Text-to-Speech May 21, 2019 text-to-speech Text to Speech
Mlphon: A Multifunctional Grapheme-Phoneme Conversion Tool Using Finite State Transducers Sep 5, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Meta Learning Text-to-Speech Synthesis in over 7000 Languages Jun 10, 2024 Meta-Learning Speech Synthesis
Effective parameter estimation methods for an ExcitNet model in generative text-to-speech systems May 21, 2019 parameter estimation Speech Synthesis
MIA-Prognosis: A Deep Learning Framework to Predict Therapy Response Oct 8, 2020 Deep Learning Prognosis
ECAPA-TDNN for Multi-speaker Text-to-speech Synthesis Mar 20, 2022 Speaker Verification Speech Synthesis
Back Transcription as a Method for Evaluating Robustness of Natural Language Understanding Models to Speech Recognition Errors Oct 25, 2023 en-US domain classification en-US Intent Classification
MelNet: A Generative Model for Audio in the Frequency Domain Jun 4, 2019 Audio Generation Music Generation
Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language Oct 29, 2018 Speech Synthesis text-to-speech