Mitigating Unauthorized Speech Synthesis for Voice Protection Oct 28, 2024 Data Augmentation Face Swapping
MnTTS: An Open-Source Mongolian Text-to-Speech Synthesis Dataset and Accompanied Baseline Sep 22, 2022 Speech Synthesis text-to-speech
End-to-End Adversarial Text-to-Speech Jun 5, 2020 Adversarial Text Dynamic Time Warping
MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis Oct 8, 2019 CPU GPU
Emotion Rendering for Conversational Speech Synthesis with Heterogeneous Graph-Based Context Modeling Dec 19, 2023 Contrastive Learning Speech Synthesis
TTS-Portuguese Corpus: a corpus for speech synthesis in Brazilian Portuguese May 11, 2020 Denoising Speech Synthesis
Evaluating Speech Synthesis by Training Recognizers on Synthetic Speech Oct 1, 2023 speech-recognition Speech Recognition
Evaluating Parameter-Efficient Transfer Learning Approaches on SURE Benchmark for Speech Understanding Mar 2, 2023 Speech Synthesis text-to-speech
Articulation GAN: Unsupervised modeling of articulatory learning Oct 27, 2022 Generative Adversarial Network Speech Synthesis
Mixer-TTS: non-autoregressive, fast and compact text-to-speech model conditioned on language model embeddings Oct 7, 2021 Language Modeling Language Modelling
Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech Nov 7, 2021 Meta-Learning Speech Synthesis
Generative Expressive Conversational Speech Synthesis Jul 31, 2024 Speech Synthesis
ArTST: Arabic Text and Speech Transformer Oct 25, 2023 Automatic Speech Recognition Automatic Speech Recognition (ASR)
APNet2: High-quality and High-efficiency Neural Vocoder with Direct Prediction of Amplitude and Phase Spectra Nov 20, 2023 Speech Synthesis
ASR data augmentation in low-resource settings using cross-lingual multi-speaker TTS and cross-lingual voice conversion Mar 29, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
A Spectral Energy Distance for Parallel Speech Synthesis Aug 3, 2020 scoring rule Speech Synthesis
EmoSpeech: Guiding FastSpeech2 Towards Emotional Text to Speech Jun 28, 2023 Emotion Recognition Speech Synthesis
Assem-VC: Realistic Voice Conversion by Assembling Modern Speech Synthesis Techniques Apr 2, 2021 Decoder Rhythm
Embedding a Differentiable Mel-cepstral Synthesis Filter to a Neural Speech Synthesis System Nov 21, 2022 GPU Speech Synthesis
EMNS /Imz/ Corpus: An emotive single-speaker dataset for narrative storytelling in games, television and graphic novels May 22, 2023 Expressive Speech Synthesis Speech Synthesis
End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions May 19, 2022 Speech Synthesis Style Transfer
From Speaker Verification to Multispeaker Speech Synthesis, Deep Transfer with Feedback Constraint May 10, 2020 Speaker Verification Speech Synthesis
Effective Deep Learning Models for Automatic Diacritization of Arabic Text Nov 1, 2020 Arabic Text Diacritization Decoder
Lip to Speech Synthesis with Visual Context Attentional GAN Apr 4, 2022 Contrastive Learning Generative Adversarial Network
EdiTTS: Score-based Editing for Controllable Text-to-Speech Oct 6, 2021 Speech Synthesis Speech-to-Text
A Survey on Non-Autoregressive Generation for Neural Machine Translation and Beyond Apr 20, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Dynamical Variational Autoencoders: A Comprehensive Review Aug 28, 2020 3D Human Dynamics Resynthesis
Effective Use of Variational Embedding Capacity in Expressive End-to-End Speech Synthesis Jun 8, 2019 Expressive Speech Synthesis Speech Synthesis
dMel: Speech Tokenization made Simple Jul 22, 2024 Decoder Language Modeling
Disentanglement in a GAN for Unconditional Speech Synthesis Jul 4, 2023 Disentanglement Generative Adversarial Network
EfficientNet-Absolute Zero for Continuous Speech Keyword Spotting Dec 31, 2020 Keyword Spotting Keyword Spotting CSS
Learning pronunciation from a foreign language in speech synthesis networks Nov 23, 2018 Speech Synthesis
Lip-to-Speech Synthesis in the Wild with Multi-task Learning Feb 17, 2023 Lip to Speech Synthesis Multi-Task Learning
Diffusion-Based Voice Conversion with Fast Maximum Likelihood Sampling Scheme Sep 28, 2021 Speech Synthesis Voice Conversion
Diffusion-Based Mel-Spectrogram Enhancement for Personalized Speech Synthesis with Found Data May 18, 2023 Speech Enhancement Speech Synthesis
DiffV2S: Diffusion-based Video-to-Speech Synthesis with Vision-guided Speaker Embedding Aug 15, 2023 Speech Synthesis
DiffProsody: Diffusion-based Latent Prosody Generation for Expressive Speech Synthesis with Prosody Conditional Adversarial Training Jul 31, 2023 Denoising Expressive Speech Synthesis
A^3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing Mar 18, 2022 Representation Learning Speaker Verification
DiffWave: A Versatile Diffusion Model for Audio Synthesis Sep 21, 2020 Audio Synthesis Diversity
Learning Disentangled Phone and Speaker Representations in a Semi-Supervised VQ-VAE Paradigm Oct 21, 2020 speaker-diarization Speaker Diarization
A Neuro-AI Interface for Evaluating Generative Adversarial Networks Mar 5, 2020 Speech Synthesis
Digital Voicing of Silent Speech Oct 6, 2020 Electromyography (EMG) Speech Synthesis
AnCoGen: Analysis, Control and Generation of Speech with a Masked Autoencoder Jan 9, 2025 Pitch Classification Pitch control
Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder Nov 7, 2022 Speech Synthesis text-to-speech
Developing multilingual speech synthesis system for Ojibwe, Mi'kmaq, and Maliseet Feb 4, 2025 Speech Synthesis text-to-speech
Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis May 17, 2020 Lip Reading Lip to Speech Synthesis
KazakhTTS: An Open-Source Kazakh Text-to-Speech Synthesis Dataset Apr 17, 2021 Speech Synthesis text-to-speech
ITAcotron 2: Transfering English Speech Synthesis Architectures and Speech Features to Italian Nov 1, 2021 Speech Synthesis
KazEmoTTS: A Dataset for Kazakh Emotional Text-to-Speech Synthesis Apr 1, 2024 Speech Synthesis text-to-speech
Deep Speech Synthesis from Articulatory Representations Sep 13, 2022 Speech Synthesis