Speech Representation Learning

Papers

Showing 26–50 of 131 papers

Title	Date	Tasks	Status	Hype
An Unsupervised Autoregressive Model for Speech Representation Learning	Apr 5, 2019	General Classification, model	Code Available	1
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning	May 17, 2023	Clustering, Language Modeling	Code Available	1
FaceXHuBERT: Text-less Speech-driven E(X)pressive 3D Facial Animation Synthesis Using Self-Supervised Speech Representation Learning	Mar 9, 2023	3D Face Animation, Representation Learning	Code Available	1
LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT	Mar 29, 2022	All, Automatic Speech Recognition	Code Available	1
Robust Data2vec: Noise-robust Speech Representation Learning for ASR by Combining Regression and Improved Contrastive Learning	Oct 27, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	1
Speech SIMCLR: Combining Contrastive and Reconstruction Objective for Self-supervised Speech Representation Learning	Oct 27, 2020	Emotion Recognition, Representation Learning	Code Available	1
DeCoAR 2.0: Deep Contextualized Acoustic Representations with Vector Quantization	Dec 11, 2020	Diversity, Quantization	Code Available	1
MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets	Nov 14, 2022	Automatic Speech Recognition, Multi-Task Learning	Code Available	1
Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation	Jan 23, 2025	Audio-Visual Speech Recognition, Multi-Task Learning	Code Available	1
Disentangled Speech Representation Learning for One-Shot Cross-lingual Voice Conversion Using β-VAE	Oct 25, 2022	Disentanglement, Representation Learning	Unverified	0
Disentangled Speech Representation Learning Based on Factorized Hierarchical Variational Autoencoder with Self-Supervised Objective	Apr 5, 2022	Disentanglement, Representation Learning	Unverified	0
A Comparison of Discrete Latent Variable Models for Speech Representation Learning	Oct 24, 2020	Phoneme Recognition, Representation Learning	Unverified	0
Disentangled Feature Learning for Real-Time Neural Speech Coding	Nov 22, 2022	Disentanglement, Representation Learning	Unverified	0
ATCSpeechNet: A multilingual end-to-end speech recognition framework for air traffic control systems	Feb 17, 2021	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
A Brief Overview of Unsupervised Neural Speech Representation Learning	Mar 1, 2022	Representation Learning, Speech Representation Learning	Unverified	0
Deep Representation Learning in Speech Processing: Challenges, Recent Advances, and Future Trends	Jan 2, 2020	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
Are Music Foundation Models Better at Singing Voice Deepfake Detection? Far-Better Fuse them with Speech Foundation Models	Sep 21, 2024	DeepFake Detection, Face Swapping	Unverified	0
Adversarially learning disentangled speech representations for robust multi-factor voice conversion	Jan 30, 2021	Representation Learning, Rhythm	Unverified	0
HYFuse: Aligning Heterogeneous Speech Pre-Trained Representations in Hyperbolic Space for Speech Emotion Recognition	Jun 3, 2025	Emotion Recognition, Representation Learning	Unverified	0
Experiments on Turkish ASR with Self-Supervised Speech Representation Learning	Oct 13, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
Application of Knowledge Distillation to Multi-task Speech Representation Learning	Oct 29, 2022	Keyword Spotting, Knowledge Distillation	Unverified	0
Improving the Robustness of DistilHuBERT to Unseen Noisy Conditions via Data Augmentation, Curriculum Learning, and Multi-Task Enhancement	Nov 12, 2022	Data Augmentation, Emotion Recognition	Unverified	0
Improving Unsupervised Subword Modeling via Disentangled Speech Representation Learning and Transformation	Jun 17, 2019	Clustering, Representation Learning	Unverified	0
Input-independent Attention Weights Are Expressive Enough: A Study of Attention in Self-supervised Audio Transformers	Jun 9, 2020	General Classification, Representation Learning	Unverified	0
General-Purpose Speech Representation Learning through a Self-Supervised Multi-Granularity Framework	Feb 3, 2021	Classification, Emotion Classification	Unverified	0

