Controllable speech synthesis by learning discrete phoneme-level prosodic representations (Nov 29, 2022). Tags: Clustering, Speech Synthesis.
Contextual Expressive Text-to-Speech (Nov 26, 2022). Tags: Speech Synthesis, text-to-speech.
Efficient Incremental Text-to-Speech on GPUs (Nov 25, 2022). Tags: GPU, Speech Synthesis.
PromptTTS: Controllable Text-to-Speech with Text Descriptions (Nov 22, 2022). Tags: Decoder, Speech Synthesis. Code available.
LA-VocE: Low-SNR Audio-visual Speech Enhancement using Neural Vocoders (Nov 20, 2022). Tags: Speech Enhancement, Speech Synthesis.
Multi-Speaker Expressive Speech Synthesis via Multiple Factors Decoupling (Nov 19, 2022). Tags: Expressive Speech Synthesis, Speech Synthesis.
Grad-StyleSpeech: Any-speaker Adaptive Text-to-Speech Synthesis with Diffusion Models (Nov 17, 2022). Tags: Speech Synthesis, text-to-speech.
Audio Anti-spoofing Using a Simple Attention Module and Joint Optimization Based on Additive Angular Margin Loss and Meta-learning (Nov 17, 2022). Tags: Binary Classification, Meta-Learning.
The Potential of Neural Speech Synthesis-based Data Augmentation for Personalized Speech Enhancement (Nov 14, 2022). Tags: Data Augmentation, Speech Enhancement.
Semi-supervised learning for continuous emotional intensity controllable speech synthesis with disentangled representations (Nov 11, 2022). Tags: Emotional Speech Synthesis, Speech Synthesis.
Deliberation Networks and How to Train Them (Nov 6, 2022). Tags: Machine Translation, Speech Synthesis.
Self-Supervised Learning for Speech Enhancement through Synthesis (Nov 4, 2022). Tags: Denoising, Self-Supervised Learning. Code available.
Predicting phoneme-level prosody latents using AR and flow-based Prior Networks for expressive speech synthesis (Nov 2, 2022). Tags: Expressive Speech Synthesis, Speech Synthesis.
A Preliminary Study on Mandarin-Hakka neural machine translation using small-sized data (Nov 1, 2022). Tags: Machine Translation, Speech Synthesis.
Taiwanese-Accented Mandarin and English Multi-Speaker Talking-Face Synthesis System (Nov 1, 2022). Tags: Face Generation, Speech Synthesis.
Technology Pipeline for Large Scale Cross-Lingual Dubbing of Lecture Videos into Multiple Indian Languages (Nov 1, 2022). Tags: Chunking, Rhythm.
Development of Mandarin-English code-switching speech synthesis system (Nov 1, 2022). Tags: Sentence, Speech Synthesis.
Adapter-Based Extension of Multi-Speaker Text-to-Speech Model for New Speakers (Nov 1, 2022). Tags: parameter-efficient fine-tuning, Speech Synthesis.
Learning utterance-level representations through token-level acoustic latents prediction for Expressive Speech Synthesis (Nov 1, 2022). Tags: Disentanglement, Diversity.
Towards Developing State-of-the-Art TTS Synthesisers for 13 Indian Languages with Signal Processing aided Alignments (Oct 31, 2022). Tags: Speech Synthesis.
Period VITS: Variational Inference with Explicit Pitch Modeling for End-to-end Emotional Speech Synthesis (Oct 28, 2022). Tags: Decoder, Diversity.
Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech (Oct 27, 2022). Tags: Automatic Speech Recognition (ASR).
Evaluating context-invariance in unsupervised speech representations (Oct 27, 2022). Tags: Language Modelling, speech-recognition. Code available.
A Fast and Accurate Pitch Estimation Algorithm Based on the Pseudo Wigner-Ville Distribution (Oct 27, 2022). Tags: Speech Synthesis. Code available.
RedPen: Region- and Reason-Annotated Dataset of Unnatural Speech (Oct 26, 2022). Tags: Speech Synthesis.
Bloom Library: Multimodal Datasets in 300+ Languages for a Variety of Downstream Tasks (Oct 26, 2022). Tags: Image Captioning, Language Modeling.
Semi-Supervised Learning Based on Reference Model for Low-resource TTS (Oct 25, 2022). Tags: Speech Synthesis, text-to-speech.
A Data-Driven Investigation of Noise-Adaptive Utterance Generation with Linguistic Modification (Oct 19, 2022). Tags: Speech Synthesis, Text Generation.
Simple and Effective Unsupervised Speech Translation (Oct 18, 2022). Tags: Domain Adaptation, Machine Translation.
Empirical Study Incorporating Linguistic Knowledge on Filled Pauses for Personalized Spontaneous Speech Synthesis (Oct 14, 2022). Tags: Speech Synthesis, Voice Cloning. Code available.
Transformer-Based Speech Synthesizer Attribution in an Open Set Scenario (Oct 14, 2022). Tags: Attribute, Misinformation.
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era (Oct 6, 2022). Tags: Speech Synthesis, text-to-speech.
The Sound of Silence: Efficiency of First Digit Features in Synthetic Audio Detection (Oct 6, 2022). Tags: Speech Synthesis, Synthetic Speech Detection. Code available.
Fully Unsupervised Training of Few-shot Keyword Spotting (Oct 6, 2022). Tags: Keyword Spotting, Metric Learning.
Unsupervised Multi-scale Expressive Speaking Style Modeling with Hierarchical Context Information for Audiobook Speech Synthesis (Oct 1, 2022). Tags: Speech Synthesis, text-to-speech.
EPIC TTS Models: Empirical Pruning Investigations Characterizing Text-To-Speech Models (Sep 22, 2022). Tags: Speech Synthesis, text-to-speech.
Controllable Accented Text-to-Speech Synthesis (Sep 22, 2022). Tags: Speech Synthesis, text-to-speech.
An Initial study on Birdsong Re-synthesis Using Neural Vocoders (Sep 21, 2022). Tags: Resynthesis, Speech Synthesis.
AutoLV: Automatic Lecture Video Generator (Sep 19, 2022). Tags: Speech Synthesis, Talking Head Generation.
ConvNeXt Based Neural Network for Audio Anti-Spoofing (Sep 14, 2022). Tags: Image Classification. Code available.
Decoupled Pronunciation and Prosody Modeling in Meta-Learning-Based Multilingual Speech Synthesis (Sep 14, 2022). Tags: Decoder, Meta-Learning.
Automated detection of pronunciation errors in non-native English speech employing deep learning (Sep 13, 2022). Tags: Speech Synthesis.
Mlphon: A Multifunctional Grapheme-Phoneme Conversion Tool Using Finite State Transducers (Sep 5, 2022). Tags: Automatic Speech Recognition (ASR). Code available.
Lip-to-Speech Synthesis for Arbitrary Speakers in the Wild (Sep 1, 2022). Tags: Lip to Speech Synthesis, Speech Synthesis.
Audio Deepfake Attribution: An Initial Dataset and Investigation (Aug 21, 2022). Tags: Audio Generation, Binary Classification.
Speech Synthesis with Mixed Emotions (Aug 11, 2022). Tags: Attribute, Emotional Speech Synthesis.
A Study of Modeling Rising Intonation in Cantonese Neural Speech Synthesis (Aug 3, 2022). Tags: Speech Synthesis, text-to-speech.
SoundChoice: Grapheme-to-Phoneme Models with Semantic Disambiguation (Jul 27, 2022). Tags: Language Modeling.
Transplantation of Conversational Speaking Style with Interjections in Sequence-to-Sequence Speech Synthesis (Jul 25, 2022). Tags: Data Augmentation, Speech Synthesis.
Controllable Data Generation by Deep Learning: A Review (Jul 19, 2022). Tags: Deep Learning, Speech Synthesis.