RF-Next: Efficient Receptive Field Search for Convolutional Neural Networks Jun 14, 2022 Action Segmentation Instance Segmentation
TranSpeech: Speech-to-Speech Translation With Bilateral Perturbation May 25, 2022 Representation Learning Rhythm
End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions May 19, 2022 Speech Synthesis Style Transfer
Deep Learning Enabled Semantic Communications with Speech Recognition and Synthesis May 9, 2022 Deep Learning Semantic Communication
SVTS: Scalable Video-to-Speech Synthesis May 4, 2022 Speech Synthesis
Requirements and Motivations of Low-Resource Speech Synthesis for Language Revitalization May 1, 2022 Speech Synthesis
A Survey on Non-Autoregressive Generation for Neural Machine Translation and Beyond Apr 20, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Lip to Speech Synthesis with Visual Context Attentional GAN Apr 4, 2022 Contrastive Learning Generative Adversarial Network
ASR data augmentation in low-resource settings using cross-lingual multi-speaker TTS and cross-lingual voice conversion Mar 29, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Unsupervised Text-to-Speech Synthesis by Unsupervised Automatic Speech Recognition Mar 29, 2022 Automatic Speech Recognition Automatic Speech Recognition (ASR)
A^3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing Mar 18, 2022 Representation Learning Speaker Verification
VocBench: A Neural Vocoder Benchmark for Speech Synthesis Dec 6, 2021 Speech Synthesis
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone Dec 4, 2021 Speech Synthesis Text-To-Speech Synthesis
Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech Nov 7, 2021 Meta-Learning Speech Synthesis
Synthesizing Speech from Intracranial Depth Electrodes using an Encoder-Decoder Framework Nov 2, 2021 Decoder EEG
ITAcotron 2: Transfering English Speech Synthesis Architectures and Speech Features to Italian Nov 1, 2021 Speech Synthesis
FMFCC-A: A Challenging Mandarin Dataset for Synthetic Speech Detection Oct 18, 2021 Speech Synthesis Synthetic Speech Detection
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing Oct 14, 2021 Automatic Speech Recognition Automatic Speech Recognition (ASR)
Fine-grained style control in Transformer-based Text-to-speech Synthesis Oct 12, 2021 Inductive Bias Speech Synthesis
Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech Oct 8, 2021 Emotion Interpretation Expressive Speech Synthesis
StrengthNet: Deep Learning-based Emotion Strength Assessment for Emotional Speech Synthesis Oct 7, 2021 Attribute Data Augmentation
Mixer-TTS: non-autoregressive, fast and compact text-to-speech model conditioned on language model embeddings Oct 7, 2021 Language Modeling Language Modelling
EdiTTS: Score-based Editing for Controllable Text-to-Speech Oct 6, 2021 Speech Synthesis Speech-to-Text
Diffusion-Based Voice Conversion with Fast Maximum Likelihood Sampling Scheme Sep 28, 2021 Speech Synthesis Voice Conversion
Neural HMMs are all you need (for high-quality attention-free TTS) Aug 30, 2021 All Speech Synthesis
One TTS Alignment To Rule Them All Aug 23, 2021 All Speech Synthesis
Conditional Sound Generation Using Neural Discrete Time-Frequency Representation Learning Jul 21, 2021 Diversity Music Generation
A Survey on Neural Speech Synthesis Jun 29, 2021 Speech Synthesis Survey
FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis Jun 29, 2021 Speech Synthesis text-to-speech
WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis Jun 17, 2021 Speech Synthesis text-to-speech
RyanSpeech: A Corpus for Conversational Text-to-Speech Synthesis Jun 15, 2021 speech-recognition Speech Recognition
Enhancing Speaking Styles in Conversational Text-to-Speech Synthesis with Graph-based Multi-modal Context Modeling Jun 11, 2021 Speech Synthesis text-to-speech
RAD-TTS: Parallel Flow-Based TTS with Robust Alignment Learning and Diverse Synthesis Jun 2, 2021 Diversity Rhythm
Byakto Speech: Real-time long speech synthesis with convolutional neural network: Transfer learning from English to Bangla May 31, 2021 Deep Learning speech-recognition
Grad-TTS: A Diffusion Probabilistic Model for Text-to-Speech May 13, 2021 Decoder Speech Synthesis
Deep Learning Based Assessment of Synthetic Speech Naturalness Apr 23, 2021 Deep Learning Prediction
KazakhTTS: An Open-Source Kazakh Text-to-Speech Synthesis Dataset Apr 17, 2021 Speech Synthesis text-to-speech
TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction Apr 16, 2021 Speech Synthesis text-to-speech
Assem-VC: Realistic Voice Conversion by Assembling Modern Speech Synthesis Techniques Apr 2, 2021 Decoder Rhythm
Multilingual Byte2Speech Models for Scalable Low-resource Speech Synthesis Mar 5, 2021 Speech Synthesis
CDPAM: Contrastive learning for perceptual audio similarity Feb 9, 2021 Contrastive Learning Speech Enhancement
Text-Free Image-to-Speech Synthesis Using Learned Segmental Units Dec 31, 2020 Image Captioning Speech Synthesis
EfficientNet-Absolute Zero for Continuous Speech Keyword Spotting Dec 31, 2020 Keyword Spotting Keyword Spotting CSS
Semi-supervised URL Segmentation with Recurrent Neural Networks Pre-trained on Knowledge Graph Entities Dec 1, 2020 Chinese Word Segmentation Speech Synthesis
TFGAN: Time and Frequency Domain Based Generative Adversarial Network for High-fidelity Speech Synthesis Nov 24, 2020 Generative Adversarial Network Speech Synthesis
Wave-Tacotron: Spectrogram-free end-to-end text-to-speech synthesis Nov 6, 2020 Decoder Speech Synthesis
Semi-supervised URL Segmentation with Recurrent Neural Networks Pre-trained on Knowledge Graph Entities Nov 5, 2020 Chinese Word Segmentation Speech Synthesis
Effective Deep Learning Models for Automatic Diacritization of Arabic Text Nov 1, 2020 Arabic Text Diacritization Decoder
Learning Disentangled Phone and Speaker Representations in a Semi-Supervised VQ-VAE Paradigm Oct 21, 2020 speaker-diarization Speaker Diarization
Digital Voicing of Silent Speech Oct 6, 2020 Electromyography (EMG) Speech Synthesis