Speech Representation Learning

Papers

Showing 1–50 of 131 papers

Title	Date	Tasks	Status	Hype
HYFuse: Aligning Heterogeneous Speech Pre-Trained Representations in Hyperbolic Space for Speech Emotion Recognition	Jun 3, 2025	Emotion Recognition, Representation Learning	Unverified	0
DuRep: Dual-Mode Speech Representation Learning via ASR-Aware Distillation	May 26, 2025	Representation Learning, Speech Representation Learning	Unverified	0
Universal Semantic Disentangled Privacy-preserving Speech Representation Learning	May 19, 2025	Decoder, Privacy Preserving	Unverified	0
UniWav: Towards Unified Pre-training for Speech Representation Learning and Generation	Mar 2, 2025	Decoder, Representation Learning	Unverified	0
Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation	Jan 23, 2025	Audio-Visual Speech Recognition, Multi-Task Learning	Code Available	1
k2SSL: A Faster and Better Framework for Self-Supervised Speech Representation Learning	Nov 26, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
EH-MAM: Easy-to-Hard Masked Acoustic Modeling for Self-Supervised Speech Representation Learning	Oct 17, 2024	Representation Learning, Self-Supervised Learning	Code Available	1
JOOCI: a Framework for Learning Comprehensive Speech Representations	Oct 14, 2024	Representation Learning, Speech Representation Learning	Unverified	0
Are Music Foundation Models Better at Singing Voice Deepfake Detection? Far-Better Fuse them with Speech Foundation Models	Sep 21, 2024	DeepFake Detection, Face Swapping	Unverified	0
Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT	Sep 16, 2024	Acoustic Unit Discovery, Clustering	Code Available	1
Progressive Residual Extraction based Pre-training for Speech Representation Learning	Aug 31, 2024	Emotion Recognition, Representation Learning	Unverified	0
Speech Representation Learning Revisited: The Necessity of Separate Learnable Parameters and Robust Data Augmentation	Aug 20, 2024	Data Augmentation, Representation Learning	Unverified	0
Towards the Next Frontier in Speech Representation Learning Using Disentanglement	Jul 2, 2024	Disentanglement, Representation Learning	Unverified	0
Towards Robust Speech Representation Learning for Thousands of Languages	Jun 30, 2024	Representation Learning, Self-Supervised Learning	Unverified	0
Emotion-Aware Speech Self-Supervised Representation Learning with Intensity Knowledge	Jun 10, 2024	Representation Learning, Self-Supervised Learning	Unverified	0
mHuBERT-147: A Compact Multilingual HuBERT Model	Jun 10, 2024	Automatic Speech Recognition (ASR), Diversity	Code Available	0
XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception	Mar 21, 2024	Audio-Visual Speech Recognition, Representation Learning	Unverified	0
An Efficient End-to-End Approach to Noise Invariant Speech Features via Multi-Task Learning	Mar 13, 2024	Denoising, Knowledge Distillation	Code Available	0
The Effect of Batch Size on Contrastive Self-Supervised Speech Representation Learning	Feb 21, 2024	Benchmarking, Representation Learning	Code Available	1
UNIT-DSR: Dysarthric Speech Reconstruction System Using Speech Unit Normalization	Jan 26, 2024	Decoder, Domain Adaptation	Unverified	0
Revisiting Self-supervised Learning of Speech Representation from a Mutual Information Perspective	Jan 16, 2024	Representation Learning, Self-Supervised Learning	Unverified	0
Efficiency-oriented approaches for self-supervised speech representation learning	Dec 18, 2023	Automatic Speech Recognition, Representation Learning	Unverified	0
Reimagining Speech: A Scoping Review of Deep Learning-Powered Voice Conversion	Nov 14, 2023	Deep Learning, Diversity	Unverified	0
Learning Disentangled Speech Representations	Nov 4, 2023	Benchmarking, Disentanglement	Unverified	0
Privacy-preserving Representation Learning for Speech Understanding	Oct 26, 2023	Classification, Emotion Recognition	Unverified	0
CLARA: Multilingual Contrastive Learning for Audio Representation Acquisition	Oct 18, 2023	Audio Classification, Contrastive Learning	Code Available	1
MUST&P-SRL: Multi-lingual and Unified Syllabification in Text and Phonetic Domains for Speech Representation Learning	Oct 17, 2023	Disentanglement, Representation Learning	Code Available	0
Spatial HuBERT: Self-supervised Spatial Speech Representation Learning for a Single Talker from Multi-channel Audio	Oct 17, 2023	Representation Learning, Self-Supervised Learning	Unverified	0
Evaluating Self-Supervised Speech Representations for Indigenous American Languages	Oct 5, 2023	Representation Learning, Speech Representation Learning	Unverified	0
Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning	Sep 25, 2023	Representation Learning, Self-Supervised Learning	Code Available	1
QS-TTS: Towards Semi-Supervised Text-to-Speech Synthesis via Vector-Quantized Self-Supervised Speech Representation Learning	Aug 31, 2023	Representation Learning, Speech Representation Learning	Code Available	1
Speech representation learning: Learning bidirectional encoders with single-view, multi-view, and multi-task methods	Jul 25, 2023	Multi-View Learning, Representation Learning	Unverified	0
MASR: Multi-label Aware Speech Representation	Jul 20, 2023	Emotion Recognition, Language Identification	Unverified	0
On-Device Constrained Self-Supervised Speech Representation Learning for Keyword Spotting via Knowledge Distillation	Jul 6, 2023	Keyword Spotting, Knowledge Distillation	Unverified	0
Flowchase: a Mobile Application for Pronunciation Training	Jul 5, 2023	Representation Learning, Speech Representation Learning	Unverified	0
Label Aware Speech Representation Learning For Language Identification	Jun 7, 2023	Language Identification, Missing Labels	Unverified	0
Simultaneous or Sequential Training? How Speech Representations Cooperate in a Multi-Task Self-Supervised Learning System	Jun 5, 2023	Multi-Task Learning, Representation Learning	Unverified	0
An empirical study on speech restoration guided by self supervised speech representation	May 30, 2023	Representation Learning, Speech Representation Learning	Unverified	0
INTapt: Information-Theoretic Adversarial Prompt Tuning for Enhanced Non-Native Speech Recognition	May 25, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
TranUSR: Phoneme-to-word Transcoder Based Unified Speech Representation Learning for Cross-lingual Speech Recognition	May 23, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning	May 17, 2023	Clustering, Language Modeling	Code Available	1
A multimodal dynamical variational autoencoder for audiovisual speech representation learning	May 5, 2023	Denoising, Disentanglement	Code Available	0
Learning Cross-lingual Visual Speech Representations	Mar 14, 2023	Representation Learning, Self-Supervised Learning	Unverified	0
FaceXHuBERT: Text-less Speech-driven E(X)pressive 3D Facial Animation Synthesis Using Self-Supervised Speech Representation Learning	Mar 9, 2023	3D Face Animation, Representation Learning	Code Available	1
Self-supervised speech representation learning for keyword-spotting with light-weight transformers	Mar 7, 2023	Keyword Spotting, Representation Learning	Unverified	0
Structured Pruning of Self-Supervised Pre-trained Models for Speech Recognition and Understanding	Feb 27, 2023	Model Compression, Representation Learning	Code Available	1
A low latency attention module for streaming self-supervised speech representation learning	Feb 27, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	0
Efficient Speech Representation Learning with Low-Bit Quantization	Dec 14, 2022	Model Compression, Quantization	Unverified	0
Improved Self-Supervised Multilingual Speech Representation Learning Combined with Auxiliary Language Information	Dec 7, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
Disentangled Feature Learning for Real-Time Neural Speech Coding	Nov 22, 2022	Disentanglement, Representation Learning	Unverified	0