Speech Representation Learning

Papers

Showing 1–50 of 131 papers

Title	Date	Tasks	Status	Hype	Score
ERNIE-SAT: Speech and Text Joint Pretraining for Cross-Lingual Multi-Speaker Text-to-Speech	Nov 7, 2022	Representation Learning, Speech Representation Learning	Code Available	6	5
W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training	Aug 7, 2021	Contrastive Learning, Language Modeling	Code Available	3	5
Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction	Jan 5, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	2	5
Robust Self-Supervised Audio-Visual Speech Recognition	Jan 5, 2022	Audio-Visual Speech Recognition, Automatic Speech Recognition	Code Available	2	5
TranSpeech: Speech-to-Speech Translation With Bilateral Perturbation	May 25, 2022	Representation Learning, Rhythm	Code Available	1	5
EH-MAM: Easy-to-Hard Masked Acoustic Modeling for Self-Supervised Speech Representation Learning	Oct 17, 2024	Representation Learning, Self-Supervised Learning	Code Available	1	5
data2vec-aqc: Search for the right Teaching Assistant in the Teacher-Student training setup	Nov 2, 2022	Automatic Speech Recognition (ASR), Language Modeling	Code Available	1	5
FaceXHuBERT: Text-less Speech-driven E(X)pressive 3D Facial Animation Synthesis Using Self-Supervised Speech Representation Learning	Mar 9, 2023	3D Face Animation, Representation Learning	Code Available	1	5
QS-TTS: Towards Semi-Supervised Text-to-Speech Synthesis via Vector-Quantized Self-Supervised Speech Representation Learning	Aug 31, 2023	Representation Learning, Speech Representation Learning	Code Available	1	5
Robust Disentangled Variational Speech Representation Learning for Zero-shot Voice Conversion	Mar 30, 2022	Data Augmentation, Decoder	Code Available	1	5
The Efficacy of Self-Supervised Speech Models for Audio Representations	Sep 26, 2022	Onset Detection, Pitch Classification	Code Available	1	5
Unsupervised speech representation learning using WaveNet autoencoders	Jan 25, 2019	Acoustic Unit Discovery, Decoder	Code Available	1	5
DeCoAR 2.0: Deep Contextualized Acoustic Representations with Vector Quantization	Dec 11, 2020	Diversity, Quantization	Code Available	1	5
Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation	Jan 23, 2025	Audio-Visual Speech Recognition, Multi-Task Learning	Code Available	1	5
CLARA: Multilingual Contrastive Learning for Audio Representation Acquisition	Oct 18, 2023	Audio Classification, Contrastive Learning	Code Available	1	5
MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets	Nov 14, 2022	Automatic Speech Recognition, Multi-Task Learning	Code Available	1	5
Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT	Sep 16, 2024	Acoustic Unit Discovery, Clustering	Code Available	1	5
An Unsupervised Autoregressive Model for Speech Representation Learning	Apr 5, 2019	General Classification, model	Code Available	1	5
Structured Pruning of Self-Supervised Pre-trained Models for Speech Recognition and Understanding	Feb 27, 2023	Model Compression, Representation Learning	Code Available	1	5
Robust Data2vec: Noise-robust Speech Representation Learning for ASR by Combining Regression and Improved Contrastive Learning	Oct 27, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	1	5
Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning	Sep 25, 2023	Representation Learning, Self-Supervised Learning	Code Available	1	5
Speech SIMCLR: Combining Contrastive and Reconstruction Objective for Self-supervised Speech Representation Learning	Oct 27, 2020	Emotion Recognition, Representation Learning	Code Available	1	5
A^3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing	Mar 18, 2022	Representation Learning, Speaker Verification	Code Available	1	5
The Effect of Batch Size on Contrastive Self-Supervised Speech Representation Learning	Feb 21, 2024	Benchmarking, Representation Learning	Code Available	1	5
UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data	Jan 19, 2021	Multi-Task Learning, Representation Learning	Code Available	1	5
UniSpeech-SAT: Universal Speech Representation Learning with Speaker Aware Pre-Training	Oct 12, 2021	Data Augmentation, Multi-Task Learning	Code Available	1	5
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning	May 17, 2023	Clustering, Language Modeling	Code Available	1	5
Using Radio Archives for Low-Resource Speech Recognition: Towards an Intelligent Virtual Assistant for Illiterate Users	Apr 27, 2021	Language Identification, Representation Learning	Code Available	1	5
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale	Nov 17, 2021	Language Identification, Representation Learning	Code Available	1	5
Supervised Speech Representation Learning for Parkinson's Disease Classification	Jun 1, 2021	Classification, Representation Learning	Code Available	1	5
SLICER: Learning universal audio representations using low-resource self-supervised pre-training	Nov 2, 2022	Audio Classification, Clustering	Code Available	1	5
Fast Development of ASR in African Languages using Self Supervised Speech Representation Learning	Mar 16, 2021	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	1	5
LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT	Mar 29, 2022	All, Automatic Speech Recognition	Code Available	1	5
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units	Jun 14, 2021	Clustering, Language Modelling	Code Available	1	5
Automatic Data Augmentation Selection and Parametrization in Contrastive Self-Supervised Speech Representation Learning	Apr 8, 2022	Contrastive Learning, Data Augmentation	Code Available	0	5
A multimodal dynamical variational autoencoder for audiovisual speech representation learning	May 5, 2023	Denoising, Disentanglement	Code Available	0	5
Deep Neural Convolutive Matrix Factorization for Articulatory Representation Decomposition	Apr 1, 2022	Phoneme Recognition, Representation Learning	Code Available	0	5
Sampling strategies in Siamese Networks for unsupervised speech representation learning	Apr 30, 2018	Representation Learning, Speech Representation Learning	Code Available	0	5
Pretext Tasks selection for multitask self-supervised speech representation learning	Jul 1, 2021	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	0	5
Conditional independence for pretext task selection in Self-supervised speech representation learning	Apr 15, 2021	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	0	5
An Efficient End-to-End Approach to Noise Invariant Speech Features via Multi-Task Learning	Mar 13, 2024	Denoising, Knowledge Distillation	Code Available	0	5
MUST&P-SRL: Multi-lingual and Unified Syllabification in Text and Phonetic Domains for Speech Representation Learning	Oct 17, 2023	Disentanglement, Representation Learning	Code Available	0	5
A low latency attention module for streaming self-supervised speech representation learning	Feb 27, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	0	5
mHuBERT-147: A Compact Multilingual HuBERT Model	Jun 10, 2024	Automatic Speech Recognition (ASR), Diversity	Code Available	0	5
DistilHuBERT: Speech Representation Learning by Layer-wise Distillation of Hidden-unit BERT	Oct 5, 2021	Multi-Task Learning, Representation Learning	Code Available	0	5
Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders	Oct 25, 2019	General Classification, Representation Learning	Code Available	0	5
PADA: Pruning Assisted Domain Adaptation for Self-Supervised Speech Representations	Mar 31, 2022	Domain Adaptation, Language Modelling	Code Available	0	5
Disentangled Speech Representation Learning for One-Shot Cross-lingual Voice Conversion Using β-VAE	Oct 25, 2022	Disentanglement, Representation Learning	Unverified	0	0
Disentangled Speech Representation Learning Based on Factorized Hierarchical Variational Autoencoder with Self-Supervised Objective	Apr 5, 2022	Disentanglement, Representation Learning	Unverified	0	0
Disentangled Feature Learning for Real-Time Neural Speech Coding	Nov 22, 2022	Disentanglement, Representation Learning	Unverified	0	0
