SOTAVerified

Speech Representation Learning

Papers

Showing 1-25 of 131 papers

Title | Status | Hype
ERNIE-SAT: Speech and Text Joint Pretraining for Cross-Lingual Multi-Speaker Text-to-Speech | Code | 6
W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training | Code | 3
Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction | Code | 2
Robust Self-Supervised Audio-Visual Speech Recognition | Code | 2
data2vec-aqc: Search for the right Teaching Assistant in the Teacher-Student training setup | Code | 1
Speech SIMCLR: Combining Contrastive and Reconstruction Objective for Self-supervised Speech Representation Learning | Code | 1
Structured Pruning of Self-Supervised Pre-trained Models for Speech Recognition and Understanding | Code | 1
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units | Code | 1
Robust Data2vec: Noise-robust Speech Representation Learning for ASR by Combining Regression and Improved Contrastive Learning | Code | 1
Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning | Code | 1
DeCoAR 2.0: Deep Contextualized Acoustic Representations with Vector Quantization | Code | 1
LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT | Code | 1
Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT | Code | 1
SLICER: Learning universal audio representations using low-resource self-supervised pre-training | Code | 1
CLARA: Multilingual Contrastive Learning for Audio Representation Acquisition | Code | 1
Multi-Task Corrupted Prediction for Learning Robust Audio-Visual Speech Representation | Code | 1
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning | Code | 1
MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets | Code | 1
FaceXHuBERT: Text-less Speech-driven E(X)pressive 3D Facial Animation Synthesis Using Self-Supervised Speech Representation Learning | Code | 1
A^3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing | Code | 1
An Unsupervised Autoregressive Model for Speech Representation Learning | Code | 1
Fast Development of ASR in African Languages using Self Supervised Speech Representation Learning | Code | 1
EH-MAM: Easy-to-Hard Masked Acoustic Modeling for Self-Supervised Speech Representation Learning | Code | 1
QS-TTS: Towards Semi-Supervised Text-to-Speech Synthesis via Vector-Quantized Self-Supervised Speech Representation Learning | Code | 1
Robust Disentangled Variational Speech Representation Learning for Zero-shot Voice Conversion | Code | 1

No leaderboard results yet.