Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling Dec 19, 2023 Contrastive Learning Speech Synthesis
What to Remember: Self-Adaptive Continual Learning for Audio Deepfake Detection Dec 15, 2023 Audio Deepfake Detection Continual Learning
Neural Text to Articulate Talk: Deep Text to Audiovisual Speech Synthesis achieving both Auditory and Photo-realism Dec 11, 2023 Face Generation Lip Reading
Learning Arousal-Valence Representation from Categorical Emotion Labels of Speech Nov 24, 2023 Dimensionality Reduction Emotion Classification
APNet2: High-quality and High-efficiency Neural Vocoder with Direct Prediction of Amplitude and Phase Spectra Nov 20, 2023 Speech Synthesis
Improved Child Text-to-Speech Synthesis through Fastpitch-based Transfer Learning Nov 7, 2023 Automatic Speech Recognition Automatic Speech Recognition (ASR)
ArTST: Arabic Text and Speech Transformer Oct 25, 2023 Automatic Speech Recognition Automatic Speech Recognition (ASR)
AutoDiff: combining Auto-encoder and Diffusion model for tabular data synthesizing Oct 24, 2023 Language Modeling Language Modelling
Evaluating Speech Synthesis by Training Recognizers on Synthetic Speech Oct 1, 2023 speech-recognition Speech Recognition
Towards Joint Modeling of Dialogue Response and Speech Synthesis based on Large Language Model Sep 20, 2023 Chatbot Language Modeling
QS-TTS: Towards Semi-Supervised Text-to-Speech Synthesis via Vector-Quantized Self-Supervised Speech Representation Learning Aug 31, 2023 Representation Learning Speech Representation Learning
DiffV2S: Diffusion-based Video-to-Speech Synthesis with Vision-guided Speaker Embedding Aug 15, 2023 Speech Synthesis
Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation Aug 3, 2023 Decoder Quantization
DiffProsody: Diffusion-based Latent Prosody Generation for Expressive Speech Synthesis with Prosody Conditional Adversarial Training Jul 31, 2023 Denoising Expressive Speech Synthesis
SC VALL-E: Style-Controllable Zero-Shot Text to Speech Synthesizer Jul 20, 2023 Expressive Speech Synthesis Language Modelling
Deep Speech Synthesis from MRI-Based Articulatory Representations Jul 5, 2023 Computational Efficiency Denoising
Disentanglement in a GAN for Unconditional Speech Synthesis Jul 4, 2023 Disentanglement Generative Adversarial Network
EmoSpeech: Guiding FastSpeech2 Towards Emotional Text to Speech Jun 28, 2023 Emotion Recognition Speech Synthesis
Intelligible Lip-to-Speech Synthesis with Speech Units May 31, 2023 Lip to Speech Synthesis Speech Synthesis
ADAPTERMIX: Exploring the Efficacy of Mixture of Adapters for Low-Resource TTS Adaptation May 29, 2023 Speech Synthesis text-to-speech
Automatic Tuning of Loss Trade-offs without Hyper-parameter Search in End-to-End Zero-Shot Speech Synthesis May 26, 2023 Decoder Speech Synthesis
Multilingual Text-to-Speech Synthesis for Turkic Languages Using Transliteration May 25, 2023 Speech Synthesis text-to-speech
EMNS /Imz/ Corpus: An emotive single-speaker dataset for narrative storytelling in games, television and graphic novels May 22, 2023 Expressive Speech Synthesis Speech Synthesis
Scaling Speech Technology to 1,000+ Languages May 22, 2023 Automatic Speech Recognition Language Identification
Diffusion-Based Mel-Spectrogram Enhancement for Personalized Speech Synthesis with Found Data May 18, 2023 Speech Enhancement Speech Synthesis
Bts-e: Audio deepfake detection using breathing-talking-silence encoder May 5, 2023 Audio Deepfake Detection DeepFake Detection
Evaluating Parameter-Efficient Transfer Learning Approaches on SURE Benchmark for Speech Understanding Mar 2, 2023 Speech Synthesis text-to-speech
Imaginary Voice: Face-styled Diffusion Model for Text-to-Speech Feb 27, 2023 Speech Synthesis text-to-speech
Lip-to-Speech Synthesis in the Wild with Multi-task Learning Feb 17, 2023 Lip to Speech Synthesis Multi-Task Learning
Cross-modal information fusion for voice spoofing detection Feb 1, 2023 Automatic Speech Recognition fake voice detection
Towards Voice Reconstruction from EEG during Imagined Speech Jan 2, 2023 Automatic Speech Recognition Automatic Speech Recognition (ASR)
RWEN-TTS: Relation-aware Word Encoding Network for Natural Text-to-Speech Synthesis Dec 15, 2022 Relation Speech Synthesis
MnTTS2: An Open-Source Multi-Speaker Mongolian Text-to-Speech Synthesis Dataset Dec 11, 2022 Speech Synthesis text-to-speech
Embedding a Differentiable Mel-cepstral Synthesis Filter to a Neural Speech Synthesis System Nov 21, 2022 GPU Speech Synthesis
OverFlow: Putting flows on top of neural transducers for better TTS Nov 13, 2022 Normalising Flows Speech Synthesis
PhaseAug: A Differentiable Augmentation for Speech Synthesis to Simulate One-to-Many Mapping Nov 8, 2022 Generative Adversarial Network Speech Synthesis
Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder Nov 7, 2022 Speech Synthesis text-to-speech
SAMO: Speaker Attractor Multi-Center One-Class Learning for Voice Anti-Spoofing Nov 4, 2022 Diversity Speaker Verification
Articulation GAN: Unsupervised modeling of articulatory learning Oct 27, 2022 Generative Adversarial Network Speech Synthesis
FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis Oct 27, 2022 Speech Synthesis text-to-speech
GAN You Hear Me? Reclaiming Unconditional Speech Synthesis from Diffusion Models Oct 11, 2022 Disentanglement Generative Adversarial Network
Voice Spoofing Countermeasures: Taxonomy, State-of-the-art, experimental analysis of generalizability, open challenges, and the way forward Oct 2, 2022 Misinformation Speaker Verification
Detection of Prosodic Boundaries in Speech Using Wav2Vec 2.0 Sep 29, 2022 Sentence Speech Synthesis
ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Speed Sep 23, 2022 Pitch control Speech Synthesis
MnTTS: An Open-Source Mongolian Text-to-Speech Synthesis Dataset and Accompanied Baseline Sep 22, 2022 Speech Synthesis text-to-speech
Deep Speech Synthesis from Articulatory Representations Sep 13, 2022 Speech Synthesis
Visualising Model Training via Vowel Space for Text-To-Speech Systems Aug 21, 2022 Speech Synthesis text-to-speech
Building African Voices Jul 1, 2022 Speech Synthesis text-to-speech
Show Me Your Face, And I'll Tell You How You Speak Jun 28, 2022 Lip to Speech Synthesis Speech Synthesis
Automatic Prosody Annotation with Pre-Trained Text-Speech Model Jun 16, 2022 Speech Synthesis text-to-speech