VisageSynTalk: Unseen Speaker Video-to-Speech Synthesis via Speech-Visage Feature Selection | Jun 15, 2022 | feature selection Speech Synthesis | Unverified | 0
RF-Next: Efficient Receptive Field Search for Convolutional Neural Networks | Jun 14, 2022 | Action Segmentation Instance Segmentation | Code Available | 1
BigVGAN: A Universal Neural Vocoder with Large-Scale Training | Jun 9, 2022 | Audio Generation Audio Synthesis | Code Available | 3
Unsupervised TTS Acoustic Modeling for TTS with Conditional Disentangled Sequential VAE | Jun 6, 2022 | Representation Learning Speech Representation Learning | Unverified | 0
Pronunciation Dictionary-Free Multilingual Speech Synthesis by Combining Unsupervised and Supervised Phonetic Representations | Jun 2, 2022 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Unverified | 0
BU-TTS: An Open-Source, Bilingual Welsh-English, Text-to-Speech Corpus | Jun 1, 2022 | Speech Synthesis text-to-speech | Unverified | 0
SyntAct: A Synthesized Database of Basic Emotions | Jun 1, 2022 | Emotion Recognition Speech Emotion Recognition | Unverified | 0
Investigating Inter- and Intra-speaker Voice Conversion using Audiobooks | Jun 1, 2022 | Speech Synthesis text-to-speech | Unverified | 0
Exploring Transfer Learning for Urdu Speech Synthesis | Jun 1, 2022 | Speech Synthesis text-to-speech | Unverified | 0
Building Open-source Speech Technology for Low-resource Minority Languages with Sámi as an Example – Tools, Methods and Experiments | Jun 1, 2022 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Unverified | 0
AiRO - an Interactive Learning Tool for Children at Risk of Dyslexia | Jun 1, 2022 | Speech Synthesis | Unverified | 0
Preparing an Endangered Language for the Digital Age: The Case of Judeo-Spanish | May 31, 2022 | Machine Translation Speech Synthesis | Code Available | 0
StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis | May 30, 2022 | Data Augmentation Self-Supervised Learning | Code Available | 2
TranSpeech: Speech-to-Speech Translation With Bilateral Perturbation | May 25, 2022 | Representation Learning Rhythm | Code Available | 1
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit | May 20, 2022 | All Automatic Speech Recognition (ASR) | Code Available | 6
End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions | May 19, 2022 | Speech Synthesis Style Transfer | Code Available | 1
SDS-200: A Swiss German Speech to Standard German Text Corpus | May 19, 2022 | Speech Synthesis Translation | Code Available | 0
Macedonian Speech Synthesis for Assistive Technology Applications | May 18, 2022 | Deep Learning Pitch control | Unverified | 0
GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech | May 15, 2022 | Speech Synthesis Style Transfer | Code Available | 2
Real-Time Packet Loss Concealment With Mixed Generative and Predictive Model | May 11, 2022 | Packet Loss Concealment Speech Enhancement | Code Available | 3
Read the Room: Adapting a Robot's Voice to Ambient and Social Contexts | May 10, 2022 | Speech Synthesis Voice Conversion | Code Available | 0
Deep Learning Enabled Semantic Communications with Speech Recognition and Synthesis | May 9, 2022 | Deep Learning Semantic Communication | Code Available | 1
NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality | May 9, 2022 | Sentence Speech Synthesis | Code Available | 2
ReCAB-VAE: Gumbel-Softmax Variational Inference Based on Analytic Divergence | May 9, 2022 | Speech Synthesis text-to-speech | Unverified | 0
SVTS: Scalable Video-to-Speech Synthesis | May 4, 2022 | Speech Synthesis | Code Available | 1
Attentive activation function for improving end-to-end spoofing countermeasure systems | May 3, 2022 | Speech Synthesis Voice Conversion | Unverified | 0
Requirements and Motivations of Low-Resource Speech Synthesis for Language Revitalization | May 1, 2022 | Speech Synthesis | Code Available | 1
Systematic Inequalities in Language Technology Performance across the World’s Languages | May 1, 2022 | Dependency Parsing Machine Translation | Code Available | 0
Improving Self-Supervised Learning-based MOS Prediction Networks | Apr 23, 2022 | Prediction Quantization | Code Available | 0
FastDiff: A Fast Conditional Diffusion Model for High-Quality Speech Synthesis | Apr 21, 2022 | Denoising GPU | Code Available | 2
A Survey on Non-Autoregressive Generation for Neural Machine Translation and Beyond | Apr 20, 2022 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Code Available | 1
Exploration strategies for articulatory synthesis of complex syllable onsets | Apr 20, 2022 | Speech Synthesis | Code Available | 0
A Post Auto-regressive GAN Vocoder Focused on Spectrum Fracture | Apr 12, 2022 | Speech Synthesis | Unverified | 0
Fine-grained Noise Control for Multispeaker Speech Synthesis | Apr 11, 2022 | Expressive Speech Synthesis Speech Synthesis | Unverified | 0
The PartialSpoof Database and Countermeasures for the Detection of Short Fake Speech Segments Embedded in an Utterance | Apr 11, 2022 | Speaker Verification Speech Synthesis | Unverified | 0
Unsupervised Quantized Prosody Representation for Controllable Speech Synthesis | Apr 7, 2022 | Quantization Speech Synthesis | Unverified | 0
MAESTRO: Matched Speech Text Representations through Modality Matching | Apr 7, 2022 | Language Modelling Self-Supervised Learning | Unverified | 0
Self-supervised learning for robust voice cloning | Apr 7, 2022 | Self-Supervised Learning Speech Synthesis | Unverified | 0
DDOS: A MOS Prediction Framework utilizing Domain Adaptive Pre-training and Distribution of Opinion Scores | Apr 7, 2022 | Self-Supervised Learning Speech Synthesis | Unverified | 0
SOMOS: The Samsung Open MOS Dataset for the Evaluation of Neural Text-to-Speech Synthesis | Apr 6, 2022 | Speech Synthesis text-to-speech | Unverified | 0
Simple and Effective Unsupervised Speech Synthesis | Apr 6, 2022 | speech-recognition Speech Recognition | Unverified | 0
A Comparison of Deep Learning MOS Predictors for Speech Synthesis Quality | Apr 5, 2022 | Benchmarking Self-Supervised Learning | Unverified | 0
Lip to Speech Synthesis with Visual Context Attentional GAN | Apr 4, 2022 | Contrastive Learning Generative Adversarial Network | Code Available | 1
VQTTS: High-Fidelity Text-to-Speech Synthesis with Self-Supervised VQ Acoustic Feature | Apr 2, 2022 | Speech Synthesis text-to-speech | Unverified | 0
Universal Adaptor: Converting Mel-Spectrograms Between Different Configurations for Speech Synthesis | Apr 1, 2022 | Speech Synthesis Voice Conversion | Code Available | 0
Residual-guided Personalized Speech Synthesis based on Face Image | Apr 1, 2022 | Speech Synthesis | Unverified | 0
AdaSpeech 4: Adaptive Text to Speech in Zero-Shot Scenarios | Apr 1, 2022 | Speech Synthesis text-to-speech | Unverified | 0
WavThruVec: Latent speech representation as intermediate features for neural speech synthesis | Mar 31, 2022 | Speech Synthesis text-to-speech | Unverified | 0
ASR data augmentation in low-resource settings using cross-lingual multi-speaker TTS and cross-lingual voice conversion | Mar 29, 2022 | Automatic Speech Recognition Automatic Speech Recognition (ASR) | Code Available | 1
Applying Syntax–Prosody Mapping Hypothesis and Prosodic Well-Formedness Constraints to Neural Sequence-to-Sequence Speech Synthesis | Mar 29, 2022 | Speech Synthesis text-to-speech | Unverified | 0