FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis | Apr 21, 2022 | Denoising GPU | Code Available 2
The PartialSpoof Database and Countermeasures for the Detection of Short Fake Speech Segments Embedded in an Utterance | Apr 11, 2022 | Speaker Verification Speech Synthesis | Unverified 0
SOMOS: The Samsung Open MOS Dataset for the Evaluation of Neural Text-to-Speech Synthesis | Apr 6, 2022 | Speech Synthesis text-to-speech | Unverified 0
VQTTS: High-Fidelity Text-to-Speech Synthesis with Self-Supervised VQ Acoustic Feature | Apr 2, 2022 | Speech Synthesis text-to-speech | Unverified 0
Unsupervised Text-to-Speech Synthesis by Unsupervised Automatic Speech Recognition | Mar 29, 2022 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Code Available 1
Applying Syntax–Prosody Mapping Hypothesis and Prosodic Well-Formedness Constraints to Neural Sequence-to-Sequence Speech Synthesis | Mar 29, 2022 | Speech Synthesis text-to-speech | Unverified 0
AutoTTS: End-to-End Text-to-Speech Synthesis through Differentiable Duration Modeling | Mar 21, 2022 | Decoder Speech Synthesis | Unverified 0
ECAPA-TDNN for Multi-speaker Text-to-speech Synthesis | Mar 20, 2022 | Speaker Verification Speech Synthesis | Code Available 0
Text-free non-parallel many-to-many voice conversion using normalising flows | Mar 15, 2022 | Normalising Flows Speech Synthesis | Unverified 0
iSTFTNet: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform | Mar 4, 2022 | Speech Synthesis text-to-speech | Code Available 2
Generative Modeling for Low Dimensional Speech Attributes with Neural Spline Flows | Mar 3, 2022 | Speech Synthesis text-to-speech | Code Available 2
Deep Performer: Score-to-Audio Music Performance Synthesis | Feb 12, 2022 | Decoder Speech Synthesis | Unverified 0
Multi-Stage Deep Transfer Learning for EmIoT-enabled Human-Computer Interaction | Feb 3, 2022 | Human-Object Interaction Detection text-to-speech | Unverified 0
Transformer-based Models of Text Normalization for Speech Applications | Feb 1, 2022 | Sentence Speech Synthesis | Unverified 0
Multi-speaker Multi-style Text-to-speech Synthesis With Single-speaker Single-style Training Data Scenarios | Dec 23, 2021 | Diversity Speech Synthesis | Unverified 0
Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus | Dec 20, 2021 | Audio Generation Singing Voice Synthesis | Code Available 1
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone | Dec 4, 2021 | Speech Synthesis Text-To-Speech Synthesis | Code Available 1
Guided-TTS: A Diffusion Model for Text-to-Speech via Classifier Guidance | Nov 23, 2021 | speech-recognition Speech Recognition | Unverified 0
Systematic Inequalities in Language Technology Performance across the World's Languages | Oct 13, 2021 | Dependency Parsing Machine Translation | Code Available 0
Fine-grained style control in Transformer-based Text-to-speech Synthesis | Oct 12, 2021 | Inductive Bias Speech Synthesis | Code Available 1
Towards Lifelong Learning of Multilingual Text-To-Speech Synthesis | Oct 9, 2021 | Lifelong learning Speech Synthesis | Code Available 0
Environment Aware Text-to-Speech Synthesis | Oct 8, 2021 | Attribute Disentanglement | Unverified 0
EdiTTS: Score-based Editing for Controllable Text-to-Speech | Oct 6, 2021 | Speech Synthesis Speech-to-Text | Code Available 1
Prosody-TTS: An end-to-end speech synthesis system with prosody control | Oct 6, 2021 | Rhythm Speech Synthesis | Unverified 0
Neural Speech Synthesis in German | Oct 3, 2021 | Speech Synthesis text-to-speech | Unverified 0
PortaSpeech: Portable and High-Quality Generative Text-to-Speech | Sep 30, 2021 | text-to-speech Text to Speech | Code Available 2
Conditioning Sequence-to-sequence Networks with Learned Activations | Sep 29, 2021 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Unverified 0
Guided-TTS: Text-to-Speech with Untranscribed Speech | Sep 29, 2021 | Speech Synthesis text-to-speech | Unverified 0
Low-Latency Incremental Text-to-Speech Synthesis with Distilled Context Prediction Network | Sep 22, 2021 | Knowledge Distillation Language Modeling | Unverified 0
A Unified Transformer-based Framework for Duplex Text Normalization | Aug 23, 2021 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Unverified 0
Extending Text-to-Speech Synthesis with Articulatory Movement Prediction using Ultrasound Tongue Imaging | Jul 12, 2021 | Prediction Speech Synthesis | Code Available 0
Location, Location: Enhancing the Evaluation of Text-to-Speech Synthesis Using the Rapid Prosody Transcription Paradigm | Jul 6, 2021 | Speech Synthesis text-to-speech | Unverified 0
Speech Synthesis from Text and Ultrasound Tongue Image-based Articulatory Input | Jul 5, 2021 | Speech Synthesis text-to-speech | Code Available 0
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis | Jun 17, 2021 | Speech Synthesis text-to-speech | Code Available 1
RyanSpeech: A Corpus for Conversational Text-to-Speech Synthesis | Jun 15, 2021 | speech-recognition Speech Recognition | Code Available 1
PriorGrad: Improving Conditional Denoising Diffusion Models with Data-Dependent Adaptive Prior | Jun 11, 2021 | Audio Generation Denoising | Code Available 0
Enhancing Speaking Styles in Conversational Text-to-Speech Synthesis with Graph-based Multi-modal Context Modeling | Jun 11, 2021 | Speech Synthesis text-to-speech | Code Available 1
An objective evaluation of the effects of recording conditions and speaker characteristics in multi-speaker deep neural speech synthesis | Jun 3, 2021 | Speaker Verification Speech Synthesis | Unverified 0
Speaker verification-derived loss and data augmentation for DNN-based multispeaker speech synthesis | Jun 3, 2021 | Data Augmentation Speaker Verification | Unverified 0
RAD-TTS: Parallel Flow-Based TTS with Robust Alignment Learning and Diverse Synthesis | Jun 2, 2021 | Diversity Rhythm | Code Available 1
Dual Script E2E framework for Multilingual and Code-Switching ASR | Jun 2, 2021 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Unverified 0
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech | May 13, 2021 | Decoder Speech Synthesis | Code Available 1
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism | May 6, 2021 | Generative Adversarial Network Singing Voice Synthesis | Code Available 2
Phrase break prediction with bidirectional encoder representations in Japanese text-to-speech synthesis | Apr 26, 2021 | Language Modeling Language Modelling | Code Available 0
KazakhTTS: An Open-Source Kazakh Text-to-Speech Synthesis Dataset | Apr 17, 2021 | Speech Synthesis text-to-speech | Code Available 1
Enhancing Word-Level Semantic Representation via Dependency Structure for Expressive Text-to-Speech Synthesis | Apr 14, 2021 | Dependency Parsing Representation Learning | Unverified 0
Flavored Tacotron: Conditional Learning for Prosodic-linguistic Features | Apr 8, 2021 | Decoder Speech Synthesis | Unverified 0
Reinforcement Learning for Emotional Text-to-Speech Synthesis with Improved Emotion Discriminability | Apr 3, 2021 | Emotion Recognition reinforcement-learning | Unverified 0
PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS | Mar 28, 2021 | Representation Learning Text-To-Speech Synthesis | Unverified 0
Continual Speaker Adaptation for Text-to-Speech Synthesis | Mar 26, 2021 | Continual Learning Diversity | Unverified 0