MParrotTTS: Multilingual Multi-speaker Text to Speech Synthesis in Low Resource Setting May 19, 2023 Speech Synthesis text-to-speech
— Unverified 0A unified front-end framework for English text-to-speech synthesis May 18, 2023 Speech Synthesis Text Normalization
— Unverified 0Accented Text-to-Speech Synthesis with Limited Data May 8, 2023 Speech Synthesis text-to-speech
— Unverified 0M2-CTTS: End-to-End Multi-scale Multi-modal Conversational Text-to-Speech Synthesis May 3, 2023 Speech Synthesis text-to-speech
— Unverified 0A Review of Deep Learning Techniques for Speech Processing Apr 30, 2023 Automatic Speech Recognition Deep Learning
— Unverified 0Zero-shot text-to-speech synthesis conditioned using self-supervised speech representation model Apr 24, 2023 Rhythm Self-Supervised Learning
— Unverified 0Enhancing Suno's Bark Text-to-Speech Model: Addressing Limitations Through Meta's Encodec and Pre-Trained Hubert Apr 18, 2023 Audio Generation Expressive Speech Synthesis
Code Code Available 4Text is All You Need: Personalizing ASR Models using Controllable Speech Synthesis Mar 27, 2023 All Automatic Speech Recognition
— Unverified 0A Survey on Audio Diffusion Models: Text To Speech Synthesis and Enhancement in Generative AI Mar 23, 2023 Speech Enhancement Speech Synthesis
— Unverified 0Controllable Prosody Generation With Partial Inputs Mar 14, 2023 Speech Synthesis text-to-speech
— Unverified 0Do Prosody Transfer Models Transfer Prosody? Mar 7, 2023 Speech Synthesis text-to-speech
— Unverified 0Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling Mar 7, 2023 In-Context Learning Language Modeling
Code Code Available 5ParrotTTS: Text-to-Speech synthesis by exploiting self-supervised representations Mar 1, 2023 Self-Supervised Learning Speech Synthesis
— Unverified 0Imaginary Voice: Face-styled Diffusion Model for Text-to-Speech Feb 27, 2023 Speech Synthesis text-to-speech
Code Code Available 1A Vector Quantized Approach for Text to Speech Synthesis on Real-World Spontaneous Speech Feb 8, 2023 Code Generation Diversity
Code Code Available 2UzbekTagger: The rule-based POS tagger for Uzbek language Jan 30, 2023 Language Modeling Language Modelling
— Unverified 0Applying Automated Machine Translation to Educational Video Courses Jan 9, 2023 Machine Translation Speech Synthesis
— Unverified 0Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers Jan 5, 2023 In-Context Learning Language Modeling
Code Code Available 7ReVISE: Self-Supervised Speech Resynthesis With Visual Input for Universal and Generalized Speech Regeneration Jan 1, 2023 Audio-Visual Speech Recognition Resynthesis
— Unverified 0ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement Dec 21, 2022 Audio-Visual Speech Recognition Resynthesis
— Unverified 0Text-to-speech synthesis based on latent variable conversion using diffusion probabilistic model and variational autoencoder Dec 16, 2022 Representation Learning Speech Synthesis
— Unverified 0Investigation of Japanese PnG BERT language model in text-to-speech synthesis for pitch accent language Dec 16, 2022 Language Modeling Language Modelling
— Unverified 0RWEN-TTS: Relation-aware Word Encoding Network for Natural Text-to-Speech Synthesis Dec 15, 2022 Relation Speech Synthesis
Code Code Available 1MnTTS2: An Open-Source Multi-Speaker Mongolian Text-to-Speech Synthesis Dataset Dec 11, 2022 Speech Synthesis text-to-speech
Code Code Available 1Towards Building Text-To-Speech Systems for the Next Billion Users Nov 17, 2022 Diversity Speech Synthesis
Code Code Available 2Grad-StyleSpeech: Any-speaker Adaptive Text-to-Speech Synthesis with Diffusion Models Nov 17, 2022 Speech Synthesis text-to-speech
— Unverified 0OverFlow: Putting flows on top of neural transducers for better TTS Nov 13, 2022 Normalising Flows Speech Synthesis
Code Code Available 1ERNIE-SAT: Speech and Text Joint Pretraining for Cross-Lingual Multi-Speaker Text-to-Speech Nov 7, 2022 Representation Learning Speech Representation Learning
Code Code Available 6Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder Nov 7, 2022 Speech Synthesis text-to-speech
Code Code Available 1Technology Pipeline for Large Scale Cross-Lingual Dubbing of Lecture Videos into Multiple Indian Languages Nov 1, 2022 Chunking Rhythm
— Unverified 0Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech Oct 27, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
— Unverified 0An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era Oct 6, 2022 Speech Synthesis text-to-speech
— Unverified 0Controllable Accented Text-to-Speech Synthesis Sep 22, 2022 Speech Synthesis text-to-speech
— Unverified 0MnTTS: An Open-Source Mongolian Text-to-Speech Synthesis Dataset and Accompanied Baseline Sep 22, 2022 Speech Synthesis text-to-speech
Code Code Available 1EPIC TTS Models: Empirical Pruning Investigations Characterizing Text-To-Speech Models Sep 22, 2022 Speech Synthesis text-to-speech
— Unverified 0Mlphon: A Multifunctional Grapheme-Phoneme Conversion Tool Using Finite State Transducers Sep 5, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Code Code Available 0ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech Jul 13, 2022 Denoising GPU
Code Code Available 3BERT, can HE predict contrastive focus? Predicting and controlling prominence in neural TTS using a language model Jul 4, 2022 Language Modeling Language Modelling
— Unverified 0R-MelNet: Reduced Mel-Spectral Modeling for Neural TTS Jun 30, 2022 Decoder GPU
— Unverified 0Automatic Prosody Annotation with Pre-Trained Text-Speech Model Jun 16, 2022 Speech Synthesis text-to-speech
Code Code Available 1BU-TTS: An Open-Source, Bilingual Welsh-English, Text-to-Speech Corpus Jun 1, 2022 Speech Synthesis text-to-speech
— Unverified 0Exploring Transfer Learning for Urdu Speech Synthesis Jun 1, 2022 Speech Synthesis text-to-speech
— Unverified 0Investigating Inter- and Intra-speaker Voice Conversion using Audiobooks Jun 1, 2022 Speech Synthesis text-to-speech
— Unverified 0Preparing an Endangered Language for the Digital Age: The Case of Judeo-Spanish May 31, 2022 Machine Translation Speech Synthesis
Code Code Available 0StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis May 30, 2022 Data Augmentation Self-Supervised Learning
Code Code Available 2PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit May 20, 2022 All Automatic Speech Recognition (ASR)
Code Code Available 6GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech May 15, 2022 Speech Synthesis Style Transfer
Code Code Available 2ReCAB-VAE: Gumbel-Softmax Variational Inference Based on Analytic Divergence May 9, 2022 Speech Synthesis text-to-speech
— Unverified 0NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality May 9, 2022 Sentence Speech Synthesis
Code Code Available 2Systematic Inequalities in Language Technology Performance across the World’s Languages May 1, 2022 Dependency Parsing Machine Translation
Code Code Available 0