| EH-MAM: Easy-to-Hard Masked Acoustic Modeling for Self-Supervised Speech Representation Learning | Oct 17, 2024 | Representation Learning, Self-Supervised Learning | Code Available | 1 |
| LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT | Mar 29, 2022 | All, Automatic Speech Recognition | Code Available | 1 |
| Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation | Jan 23, 2025 | Audio-Visual Speech Recognition, Multi-Task Learning | Code Available | 1 |
| QS-TTS: Towards Semi-Supervised Text-to-Speech Synthesis via Vector-Quantized Self-Supervised Speech Representation Learning | Aug 31, 2023 | Representation Learning, Speech Representation Learning | Code Available | 1 |
| data2vec-aqc: Search for the right Teaching Assistant in the Teacher-Student training setup | Nov 2, 2022 | Automatic Speech Recognition (ASR), Language Modeling | Code Available | 1 |
| A^3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing | Mar 18, 2022 | Representation Learning, Speaker Verification | Code Available | 1 |
| CLARA: Multilingual Contrastive Learning for Audio Representation Acquisition | Oct 18, 2023 | Audio Classification, Contrastive Learning | Code Available | 1 |
| DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning | May 17, 2023 | Clustering, Language Modeling | Code Available | 1 |
| FaceXHuBERT: Text-less Speech-driven E(X)pressive 3D Facial Animation Synthesis Using Self-Supervised Speech Representation Learning | Mar 9, 2023 | 3D Face Animation, Representation Learning | Code Available | 1 |
| DeCoAR 2.0: Deep Contextualized Acoustic Representations with Vector Quantization | Dec 11, 2020 | Diversity, Quantization | Code Available | 1 |