| VATLM: Visual-Audio-Text Pre-Training with Unified Masked Prediction for Speech Representation Learning | Nov 21, 2022 | Audio-Visual Speech Recognition, Language Modelling | Unverified | 0 |
| MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets | Nov 14, 2022 | Automatic Speech Recognition, Multi-Task Learning | Code Available | 1 |
| Improving the Robustness of DistilHuBERT to Unseen Noisy Conditions via Data Augmentation, Curriculum Learning, and Multi-Task Enhancement | Nov 12, 2022 | Data Augmentation, Emotion Recognition | Unverified | 0 |
| ERNIE-SAT: Speech and Text Joint Pretraining for Cross-Lingual Multi-Speaker Text-to-Speech | Nov 7, 2022 | Representation Learning, Speech Representation Learning | Code Available | 6 |
| SLICER: Learning universal audio representations using low-resource self-supervised pre-training | Nov 2, 2022 | Audio Classification, Clustering | Code Available | 1 |
| data2vec-aqc: Search for the right Teaching Assistant in the Teacher-Student training setup | Nov 2, 2022 | Automatic Speech Recognition (ASR), Language Modeling | Code Available | 1 |
| Application of Knowledge Distillation to Multi-task Speech Representation Learning | Oct 29, 2022 | Keyword Spotting, Knowledge Distillation | Unverified | 0 |
| Robust Data2vec: Noise-robust Speech Representation Learning for ASR by Combining Regression and Improved Contrastive Learning | Oct 27, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| Disentangled Speech Representation Learning for One-Shot Cross-lingual Voice Conversion Using β-VAE | Oct 25, 2022 | Disentanglement, Representation Learning | Unverified | 0 |
| Improving Speech Representation Learning via Speech-level and Phoneme-level Masking Approach | Oct 25, 2022 | Representation Learning, Speaker Recognition | Unverified | 0 |
| SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning | Oct 16, 2022 | Audio Generation, Representation Learning | Unverified | 0 |
| Experiments on Turkish ASR with Self-Supervised Speech Representation Learning | Oct 13, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| On the Use of Semantically-Aligned Speech Representations for Spoken Language Understanding | Oct 11, 2022 | Representation Learning, Sentence | Unverified | 0 |
| The Efficacy of Self-Supervised Speech Models for Audio Representations | Sep 26, 2022 | Onset Detection, Pitch Classification | Code Available | 1 |
| Unsupervised TTS Acoustic Modeling for TTS with Conditional Disentangled Sequential VAE | Jun 6, 2022 | Representation Learning, Speech Representation Learning | Unverified | 0 |
| Self-supervised models of audio effectively explain human cortical responses to speech | May 27, 2022 | Representation Learning, Speech Representation Learning | Unverified | 0 |
| TranSpeech: Speech-to-Speech Translation With Bilateral Perturbation | May 25, 2022 | Representation Learning, Rhythm | Code Available | 1 |
| Self-Supervised Speech Representation Learning: A Review | May 21, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation | May 17, 2022 | Representation Learning, Retrieval | Unverified | 0 |
| Automatic Data Augmentation Selection and Parametrization in Contrastive Self-Supervised Speech Representation Learning | Apr 8, 2022 | Contrastive Learning, Data Augmentation | Code Available | 0 |
| Automatic Pronunciation Assessment using Self-Supervised Speech Representation Learning | Apr 8, 2022 | Representation Learning, Self-Supervised Learning | Unverified | 0 |
| Disentangled Speech Representation Learning Based on Factorized Hierarchical Variational Autoencoder with Self-Supervised Objective | Apr 5, 2022 | Disentanglement, Representation Learning | Unverified | 0 |
| Deep Neural Convolutive Matrix Factorization for Articulatory Representation Decomposition | Apr 1, 2022 | Phoneme Recognition, Representation Learning | Code Available | 0 |
| PADA: Pruning Assisted Domain Adaptation for Self-Supervised Speech Representations | Mar 31, 2022 | Domain Adaptation, Language Modelling | Code Available | 0 |
| Robust Disentangled Variational Speech Representation Learning for Zero-shot Voice Conversion | Mar 30, 2022 | Data Augmentation, Decoder | Code Available | 1 |
| LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT | Mar 29, 2022 | All, Automatic Speech Recognition | Code Available | 1 |
| Robust Speaker Recognition with Transformers Using wav2vec 2.0 | Mar 28, 2022 | Data Augmentation, Representation Learning | Unverified | 0 |
| Leveraging unsupervised and weakly-supervised data to improve direct speech-to-speech translation | Mar 24, 2022 | Representation Learning, Speech Representation Learning | Unverified | 0 |
| XTREME-S: Evaluating Cross-lingual Speech Representations | Mar 21, 2022 | Representation Learning, Retrieval | Unverified | 0 |
| A^3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing | Mar 18, 2022 | Representation Learning, Speaker Verification | Code Available | 1 |
| Privacy-Preserving Speech Representation Learning using Vector Quantization | Mar 15, 2022 | Privacy Preserving, Quantization | Unverified | 0 |
| Language Adaptive Cross-lingual Speech Representation Learning with Sparse Sharing Sub-networks | Mar 9, 2022 | Representation Learning, speech-recognition | Unverified | 0 |
| A Brief Overview of Unsupervised Neural Speech Representation Learning | Mar 1, 2022 | Representation Learning, Speech Representation Learning | Unverified | 0 |
| A Noise-Robust Self-supervised Pre-training Model Based Speech Representation Learning for Automatic Speech Recognition | Jan 22, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| A Deep Paradigm for Articulatory Speech Representation Learning via Neural Convolutive Sparse Matrix Factorization | Jan 16, 2022 | Phoneme Recognition, Representation Learning | Unverified | 0 |
| Robust Self-Supervised Audio-Visual Speech Recognition | Jan 5, 2022 | Audio-Visual Speech Recognition, Automatic Speech Recognition | Code Available | 2 |
| Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction | Jan 5, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 2 |
| Robust Speech Representation Learning via Flow-based Embedding Regularization | Dec 7, 2021 | Deep Learning, Language Identification | Unverified | 0 |
| XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale | Nov 17, 2021 | Language Identification, Representation Learning | Code Available | 1 |
| Characterizing the adversarial vulnerability of speech self-supervised learning | Nov 8, 2021 | Adversarial Robustness, Benchmarking | Unverified | 0 |
| Improving Noise Robustness of Contrastive Speech Representation Learning with Speech Reconstruction | Oct 28, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Speech Representation Learning Through Self-supervised Pretraining And Multi-task Finetuning | Oct 18, 2021 | Multi-Task Learning, Representation Learning | Unverified | 0 |
| Conformer-Based Self-Supervised Learning for Non-Speech Audio Tasks | Oct 14, 2021 | Audio Classification, Representation Learning | Unverified | 0 |
| UniSpeech-SAT: Universal Speech Representation Learning with Speaker Aware Pre-Training | Oct 12, 2021 | Data Augmentation, Multi-Task Learning | Code Available | 1 |
| DistilHuBERT: Speech Representation Learning by Layer-wise Distillation of Hidden-unit BERT | Oct 5, 2021 | Multi-Task Learning, Representation Learning | Code Available | 0 |
| W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training | Aug 7, 2021 | Contrastive Learning, Language Modeling | Code Available | 3 |
| An Adapter Based Pre-Training for Efficient and Scalable Self-Supervised Speech Representation Learning | Jul 26, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Pretext Tasks selection for multitask self-supervised speech representation learning | Jul 1, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 0 |
| HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units | Jun 14, 2021 | Clustering, Language Modelling | Code Available | 1 |
| PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition | Jun 10, 2021 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |