| QS-TTS: Towards Semi-Supervised Text-to-Speech Synthesis via Vector-Quantized Self-Supervised Speech Representation Learning | Aug 31, 2023 | Representation Learning, Speech Representation Learning | Code Available | 1 |
| DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning | May 17, 2023 | Clustering, Language Modeling | Code Available | 1 |
| FaceXHuBERT: Text-less Speech-driven E(X)pressive 3D Facial Animation Synthesis Using Self-Supervised Speech Representation Learning | Mar 9, 2023 | 3D Face Animation, Representation Learning | Code Available | 1 |
| Structured Pruning of Self-Supervised Pre-trained Models for Speech Recognition and Understanding | Feb 27, 2023 | Model Compression, Representation Learning | Code Available | 1 |
| MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets | Nov 14, 2022 | Automatic Speech Recognition, Multi-Task Learning | Code Available | 1 |
| data2vec-aqc: Search for the right Teaching Assistant in the Teacher-Student training setup | Nov 2, 2022 | Automatic Speech Recognition (ASR), Language Modeling | Code Available | 1 |
| SLICER: Learning universal audio representations using low-resource self-supervised pre-training | Nov 2, 2022 | Audio Classification, Clustering | Code Available | 1 |
| Robust Data2vec: Noise-robust Speech Representation Learning for ASR by Combining Regression and Improved Contrastive Learning | Oct 27, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| The Efficacy of Self-Supervised Speech Models for Audio Representations | Sep 26, 2022 | Onset Detection, Pitch Classification | Code Available | 1 |
| TranSpeech: Speech-to-Speech Translation With Bilateral Perturbation | May 25, 2022 | Representation Learning, Rhythm | Code Available | 1 |