On granularity of prosodic representations in expressive text-to-speech | Jan 26, 2023 | Expressive Speech Synthesis, Speech Synthesis | Unverified 0
Multilingual Multiaccented Multispeaker TTS with RADTTS | Jan 24, 2023 | Speech Synthesis | Unverified 0
Regeneration Learning: A Learning Paradigm for Data Generation | Jan 21, 2023 | Image Generation, Representation Learning | Unverified 0
Applying Automated Machine Translation to Educational Video Courses | Jan 9, 2023 | Machine Translation, Speech Synthesis | Unverified 0
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers | Jan 5, 2023 | In-Context Learning, Language Modeling | Code Available 7
Towards Voice Reconstruction from EEG during Imagined Speech | Jan 2, 2023 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available 1
ReVISE: Self-Supervised Speech Resynthesis With Visual Input for Universal and Generalized Speech Regeneration | Jan 1, 2023 | Audio-Visual Speech Recognition, Resynthesis | Unverified 0
HMM-based data augmentation for E2E systems for building conversational speech synthesis systems | Dec 22, 2022 | Data Augmentation, Language Modeling | Unverified 0
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement | Dec 21, 2022 | Audio-Visual Speech Recognition, Resynthesis | Unverified 0
Investigation of Japanese PnG BERT language model in text-to-speech synthesis for pitch accent language | Dec 16, 2022 | Language Modeling, Language Modelling | Unverified 0
Text-to-speech synthesis based on latent variable conversion using diffusion probabilistic model and variational autoencoder | Dec 16, 2022 | Representation Learning, Speech Synthesis | Unverified 0
RWEN-TTS: Relation-aware Word Encoding Network for Natural Text-to-Speech Synthesis | Dec 15, 2022 | Relation, Speech Synthesis | Code Available 1
Style-Label-Free: Cross-Speaker Style Transfer by Quantized VAE and Speaker-wise Normalization in Speech Synthesis | Dec 13, 2022 | Data Augmentation, Speech Synthesis | Unverified 0
MnTTS2: An Open-Source Multi-Speaker Mongolian Text-to-Speech Synthesis Dataset | Dec 11, 2022 | Speech Synthesis, text-to-speech | Code Available 1
VideoDubber: Machine Translation with Speech-Aware Length Control for Video Dubbing | Nov 30, 2022 | Machine Translation, Sentence | Unverified 0
SNAC: Speaker-normalized affine coupling layer in flow-based architecture for zero-shot multi-speaker text-to-speech | Nov 30, 2022 | Speech Synthesis, text-to-speech | Unverified 0
Controllable speech synthesis by learning discrete phoneme-level prosodic representations | Nov 29, 2022 | Clustering, Speech Synthesis | Unverified 0
Contextual Expressive Text-to-Speech | Nov 26, 2022 | Speech Synthesis, text-to-speech | Unverified 0
Efficient Incremental Text-to-Speech on GPUs | Nov 25, 2022 | GPU, Speech Synthesis | Unverified 0
PromptTTS: Controllable Text-to-Speech with Text Descriptions | Nov 22, 2022 | Decoder, Speech Synthesis | Code Available 0
Embedding a Differentiable Mel-cepstral Synthesis Filter to a Neural Speech Synthesis System | Nov 21, 2022 | GPU, Speech Synthesis | Code Available 1
LA-VocE: Low-SNR Audio-visual Speech Enhancement using Neural Vocoders | Nov 20, 2022 | Speech Enhancement, Speech Synthesis | Unverified 0
Multi-Speaker Expressive Speech Synthesis via Multiple Factors Decoupling | Nov 19, 2022 | Expressive Speech Synthesis, Speech Synthesis | Unverified 0
Audio Anti-spoofing Using a Simple Attention Module and Joint Optimization Based on Additive Angular Margin Loss and Meta-learning | Nov 17, 2022 | Binary Classification, Meta-Learning | Unverified 0
Towards Building Text-To-Speech Systems for the Next Billion Users | Nov 17, 2022 | Diversity, Speech Synthesis | Code Available 2
Grad-StyleSpeech: Any-speaker Adaptive Text-to-Speech Synthesis with Diffusion Models | Nov 17, 2022 | Speech Synthesis, text-to-speech | Unverified 0
The Potential of Neural Speech Synthesis-based Data Augmentation for Personalized Speech Enhancement | Nov 14, 2022 | Data Augmentation, Speech Enhancement | Unverified 0
OverFlow: Putting flows on top of neural transducers for better TTS | Nov 13, 2022 | Normalising Flows, Speech Synthesis | Code Available 1
Semi-supervised learning for continuous emotional intensity controllable speech synthesis with disentangled representations | Nov 11, 2022 | Emotional Speech Synthesis, Speech Synthesis | Unverified 0
PhaseAug: A Differentiable Augmentation for Speech Synthesis to Simulate One-to-Many Mapping | Nov 8, 2022 | Generative Adversarial Network, Speech Synthesis | Code Available 1
ERNIE-SAT: Speech and Text Joint Pretraining for Cross-Lingual Multi-Speaker Text-to-Speech | Nov 7, 2022 | Representation Learning, Speech Representation Learning | Code Available 6
Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder | Nov 7, 2022 | Speech Synthesis, text-to-speech | Code Available 1
Deliberation Networks and How to Train Them | Nov 6, 2022 | Machine Translation, Speech Synthesis | Unverified 0
Self-Supervised Learning for Speech Enhancement through Synthesis | Nov 4, 2022 | Denoising, Self-Supervised Learning | Code Available 0
SAMO: Speaker Attractor Multi-Center One-Class Learning for Voice Anti-Spoofing | Nov 4, 2022 | Diversity, Speaker Verification | Code Available 1
Predicting phoneme-level prosody latents using AR and flow-based Prior Networks for expressive speech synthesis | Nov 2, 2022 | Expressive Speech Synthesis, Speech Synthesis | Unverified 0
Taiwanese-Accented Mandarin and English Multi-Speaker Talking-Face Synthesis System | Nov 1, 2022 | Face Generation, Speech Synthesis | Unverified 0
A Preliminary Study on Mandarin-Hakka neural machine translation using small-sized data | Nov 1, 2022 | Machine Translation, Speech Synthesis | Unverified 0
Development of Mandarin-English code-switching speech synthesis system | Nov 1, 2022 | Sentence, Speech Synthesis | Unverified 0
Technology Pipeline for Large Scale Cross-Lingual Dubbing of Lecture Videos into Multiple Indian Languages | Nov 1, 2022 | Chunking, Rhythm | Unverified 0
Learning utterance-level representations through token-level acoustic latents prediction for Expressive Speech Synthesis | Nov 1, 2022 | Disentanglement, Diversity | Unverified 0
Adapter-Based Extension of Multi-Speaker Text-to-Speech Model for New Speakers | Nov 1, 2022 | parameter-efficient fine-tuning, Speech Synthesis | Unverified 0
Towards Developing State-of-the-Art TTS Synthesisers for 13 Indian Languages with Signal Processing aided Alignments | Oct 31, 2022 | Speech Synthesis | Unverified 0
Period VITS: Variational Inference with Explicit Pitch Modeling for End-to-end Emotional Speech Synthesis | Oct 28, 2022 | Decoder, Diversity | Unverified 0
Evaluating context-invariance in unsupervised speech representations | Oct 27, 2022 | Language Modelling, speech-recognition | Code Available 0
FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis | Oct 27, 2022 | Speech Synthesis, text-to-speech | Code Available 1
Articulation GAN: Unsupervised modeling of articulatory learning | Oct 27, 2022 | Generative Adversarial Network, Speech Synthesis | Code Available 1
A Fast and Accurate Pitch Estimation Algorithm Based on the Pseudo Wigner-Ville Distribution | Oct 27, 2022 | Speech Synthesis | Code Available 0
Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech | Oct 27, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified 0
RedPen: Region- and Reason-Annotated Dataset of Unnatural Speech | Oct 26, 2022 | Speech Synthesis | Unverified 0