RedPen: Region- and Reason-Annotated Dataset of Unnatural Speech | Oct 26, 2022 | Speech Synthesis | Unverified | 0
Semi-Supervised Learning Based on Reference Model for Low-resource TTS | Oct 25, 2022 | Speech Synthesis, text-to-speech | Unverified | 0
A Data-Driven Investigation of Noise-Adaptive Utterance Generation with Linguistic Modification | Oct 19, 2022 | Speech Synthesis, Text Generation | Unverified | 0
Simple and Effective Unsupervised Speech Translation | Oct 18, 2022 | Domain Adaptation, Machine Translation | Unverified | 0
Transformer-Based Speech Synthesizer Attribution in an Open Set Scenario | Oct 14, 2022 | Attribute, Misinformation | Unverified | 0
Empirical Study Incorporating Linguistic Knowledge on Filled Pauses for Personalized Spontaneous Speech Synthesis | Oct 14, 2022 | Speech Synthesis, Voice Cloning | Code Available | 0
GAN You Hear Me? Reclaiming Unconditional Speech Synthesis from Diffusion Models | Oct 11, 2022 | Disentanglement, Generative Adversarial Network | Code Available | 1
An Overview of Affective Speech Synthesis and Conversion in the Deep Learning Era | Oct 6, 2022 | Speech Synthesis, text-to-speech | Unverified | 0
Fully Unsupervised Training of Few-shot Keyword Spotting | Oct 6, 2022 | Keyword Spotting, Metric Learning | Unverified | 0
The Sound of Silence: Efficiency of First Digit Features in Synthetic Audio Detection | Oct 6, 2022 | Speech Synthesis, Synthetic Speech Detection | Code Available | 0
Voice Spoofing Countermeasures: Taxonomy, State-of-the-art, experimental analysis of generalizability, open challenges, and the way forward | Oct 2, 2022 | Misinformation, Speaker Verification | Code Available | 1
Unsupervised Multi-scale Expressive Speaking Style Modeling with Hierarchical Context Information for Audiobook Speech Synthesis | Oct 1, 2022 | Speech Synthesis, text-to-speech | Unverified | 0
Detection of Prosodic Boundaries in Speech Using Wav2Vec 2.0 | Sep 29, 2022 | Sentence, Speech Synthesis | Code Available | 1
ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Speed | Sep 23, 2022 | Pitch control, Speech Synthesis | Code Available | 1
EPIC TTS Models: Empirical Pruning Investigations Characterizing Text-To-Speech Models | Sep 22, 2022 | Speech Synthesis, text-to-speech | Unverified | 0
MnTTS: An Open-Source Mongolian Text-to-Speech Synthesis Dataset and Accompanied Baseline | Sep 22, 2022 | Speech Synthesis, text-to-speech | Code Available | 1
Controllable Accented Text-to-Speech Synthesis | Sep 22, 2022 | Speech Synthesis, text-to-speech | Unverified | 0
An Initial study on Birdsong Re-synthesis Using Neural Vocoders | Sep 21, 2022 | Resynthesis, Speech Synthesis | Unverified | 0
AutoLV: Automatic Lecture Video Generator | Sep 19, 2022 | Speech Synthesis, Talking Head Generation | Unverified | 0
Decoupled Pronunciation and Prosody Modeling in Meta-Learning-Based Multilingual Speech Synthesis | Sep 14, 2022 | Decoder, Meta-Learning | Unverified | 0
ConvNeXt Based Neural Network for Audio Anti-Spoofing | Sep 14, 2022 | image-classification, Image Classification | Code Available | 0
Automated detection of pronunciation errors in non-native English speech employing deep learning | Sep 13, 2022 | Speech Synthesis | Unverified | 0
Deep Speech Synthesis from Articulatory Representations | Sep 13, 2022 | Speech Synthesis | Code Available | 1
Mlphon: A Multifunctional Grapheme-Phoneme Conversion Tool Using Finite State Transducers | Sep 5, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0
Lip-to-Speech Synthesis for Arbitrary Speakers in the Wild | Sep 1, 2022 | Lip to Speech Synthesis, Speech Synthesis | Unverified | 0
Audio Deepfake Attribution: An Initial Dataset and Investigation | Aug 21, 2022 | Audio Generation, Binary Classification | Unverified | 0
Visualising Model Training via Vowel Space for Text-To-Speech Systems | Aug 21, 2022 | Speech Synthesis, text-to-speech | Code Available | 1
Speech Synthesis with Mixed Emotions | Aug 11, 2022 | Attribute, Emotional Speech Synthesis | Unverified | 0
A Study of Modeling Rising Intonation in Cantonese Neural Speech Synthesis | Aug 3, 2022 | Speech Synthesis, text-to-speech | Unverified | 0
SoundChoice: Grapheme-to-Phoneme Models with Semantic Disambiguation | Jul 27, 2022 | Language Modeling, Language Modelling | Unverified | 0
Transplantation of Conversational Speaking Style with Interjections in Sequence-to-Sequence Speech Synthesis | Jul 25, 2022 | Data Augmentation, Speech Synthesis | Unverified | 0
Controllable Data Generation by Deep Learning: A Review | Jul 19, 2022 | Deep Learning, Speech Synthesis | Unverified | 0
ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech | Jul 13, 2022 | Denoising, GPU | Code Available | 3
PoeticTTS -- Controllable Poetry Reading for Literary Studies | Jul 11, 2022 | Speech Synthesis | Unverified | 0
Speaker Anonymization with Phonetic Intermediate Representations | Jul 11, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0
FastLTS: Non-Autoregressive End-to-End Unconstrained Lip-to-Speech Synthesis | Jul 8, 2022 | Lip to Speech Synthesis, Speech Synthesis | Code Available | 0
End-to-End Binaural Speech Synthesis | Jul 8, 2022 | Decoder, Speech Synthesis | Unverified | 0
BERT, can HE predict contrastive focus? Predicting and controlling prominence in neural TTS using a language model | Jul 4, 2022 | Language Modeling, Language Modelling | Unverified | 0
Mix and Match: An Empirical Study on Training Corpus Composition for Polyglot Text-To-Speech (TTS) | Jul 4, 2022 | Speech Synthesis, text-to-speech | Unverified | 0
Computer-assisted Pronunciation Training -- Speech synthesis is almost all you need | Jul 2, 2022 | All, Speech Synthesis | Unverified | 0
Building African Voices | Jul 1, 2022 | Speech Synthesis, text-to-speech | Code Available | 1
TTS-by-TTS 2: Data-selective augmentation for neural speech synthesis using ranking support vector machine with variational autoencoder | Jun 30, 2022 | Speech Synthesis, text-to-speech | Unverified | 0
R-MelNet: Reduced Mel-Spectral Modeling for Neural TTS | Jun 30, 2022 | Decoder, GPU | Unverified | 0
iEmoTTS: Toward Robust Cross-Speaker Emotion Transfer and Control for Speech Synthesis based on Disentanglement between Prosody and Timbre | Jun 29, 2022 | Disentanglement, Speaker Identification | Unverified | 0
Expressive, Variable, and Controllable Duration Modelling in TTS | Jun 28, 2022 | Normalising Flows, Speech Synthesis | Unverified | 0
Show Me Your Face, And I'll Tell You How You Speak | Jun 28, 2022 | Lip to Speech Synthesis, Speech Synthesis | Code Available | 1
Self-supervised Context-aware Style Representation for Expressive Speech Synthesis | Jun 25, 2022 | Contrastive Learning, Deep Clustering | Unverified | 0
WOLONet: Wave Outlooker for Efficient and High Fidelity Speech Synthesis | Jun 20, 2022 | CPU, Speech Synthesis | Unverified | 0
Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis Using Linguistic and Prosodic Contexts of Dialogue History | Jun 16, 2022 | Self-Supervised Learning, Sentence | Unverified | 0
Automatic Prosody Annotation with Pre-Trained Text-Speech Model | Jun 16, 2022 | Speech Synthesis, text-to-speech | Code Available | 1