Speech Representation Learning

Papers


Showing 51–100 of 131 papers

Title	Date	Tasks	Status	Hype
VATLM: Visual-Audio-Text Pre-Training with Unified Masked Prediction for Speech Representation Learning	Nov 21, 2022	Audio-Visual Speech Recognition, Language Modelling	Unverified	0
MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets	Nov 14, 2022	Automatic Speech Recognition, Multi-Task Learning	Code Available	1
Improving the Robustness of DistilHuBERT to Unseen Noisy Conditions via Data Augmentation, Curriculum Learning, and Multi-Task Enhancement	Nov 12, 2022	Data Augmentation, Emotion Recognition	Unverified	0
ERNIE-SAT: Speech and Text Joint Pretraining for Cross-Lingual Multi-Speaker Text-to-Speech	Nov 7, 2022	Representation Learning, Speech Representation Learning	Code Available	6
SLICER: Learning universal audio representations using low-resource self-supervised pre-training	Nov 2, 2022	Audio Classification, Clustering	Code Available	1
data2vec-aqc: Search for the right Teaching Assistant in the Teacher-Student training setup	Nov 2, 2022	Automatic Speech Recognition (ASR), Language Modeling	Code Available	1
Application of Knowledge Distillation to Multi-task Speech Representation Learning	Oct 29, 2022	Keyword Spotting, Knowledge Distillation	Unverified	0
Robust Data2vec: Noise-robust Speech Representation Learning for ASR by Combining Regression and Improved Contrastive Learning	Oct 27, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	1
Disentangled Speech Representation Learning for One-Shot Cross-lingual Voice Conversion Using β-VAE	Oct 25, 2022	Disentanglement, Representation Learning	Unverified	0
Improving Speech Representation Learning via Speech-level and Phoneme-level Masking Approach	Oct 25, 2022	Representation Learning, Speaker Recognition	Unverified	0
SUPERB @ SLT 2022: Challenge on Generalization and Efficiency of Self-Supervised Speech Representation Learning	Oct 16, 2022	Audio Generation, Representation Learning	Unverified	0
Experiments on Turkish ASR with Self-Supervised Speech Representation Learning	Oct 13, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
On the Use of Semantically-Aligned Speech Representations for Spoken Language Understanding	Oct 11, 2022	Representation Learning, Sentence	Unverified	0
The Efficacy of Self-Supervised Speech Models for Audio Representations	Sep 26, 2022	Onset Detection, Pitch Classification	Code Available	1
Unsupervised TTS Acoustic Modeling for TTS with Conditional Disentangled Sequential VAE	Jun 6, 2022	Representation Learning, Speech Representation Learning	Unverified	0
Self-supervised models of audio effectively explain human cortical responses to speech	May 27, 2022	Representation Learning, Speech Representation Learning	Unverified	0
TranSpeech: Speech-to-Speech Translation With Bilateral Perturbation	May 25, 2022	Representation Learning, Rhythm	Code Available	1
Self-Supervised Speech Representation Learning: A Review	May 21, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation	May 17, 2022	Representation Learning, Retrieval	Unverified	0
Automatic Data Augmentation Selection and Parametrization in Contrastive Self-Supervised Speech Representation Learning	Apr 8, 2022	Contrastive Learning, Data Augmentation	Code Available	0
Automatic Pronunciation Assessment using Self-Supervised Speech Representation Learning	Apr 8, 2022	Representation Learning, Self-Supervised Learning	Unverified	0
Disentangled Speech Representation Learning Based on Factorized Hierarchical Variational Autoencoder with Self-Supervised Objective	Apr 5, 2022	Disentanglement, Representation Learning	Unverified	0
Deep Neural Convolutive Matrix Factorization for Articulatory Representation Decomposition	Apr 1, 2022	Phoneme Recognition, Representation Learning	Code Available	0
PADA: Pruning Assisted Domain Adaptation for Self-Supervised Speech Representations	Mar 31, 2022	Domain Adaptation, Language Modelling	Code Available	0
Robust Disentangled Variational Speech Representation Learning for Zero-shot Voice Conversion	Mar 30, 2022	Data Augmentation, Decoder	Code Available	1
LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT	Mar 29, 2022	All, Automatic Speech Recognition	Code Available	1
Robust Speaker Recognition with Transformers Using wav2vec 2.0	Mar 28, 2022	Data Augmentation, Representation Learning	Unverified	0
Leveraging unsupervised and weakly-supervised data to improve direct speech-to-speech translation	Mar 24, 2022	Representation Learning, Speech Representation Learning	Unverified	0
XTREME-S: Evaluating Cross-lingual Speech Representations	Mar 21, 2022	Representation Learning, Retrieval	Unverified	0
A^3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing	Mar 18, 2022	Representation Learning, Speaker Verification	Code Available	1
Privacy-Preserving Speech Representation Learning using Vector Quantization	Mar 15, 2022	Privacy Preserving, Quantization	Unverified	0
Language Adaptive Cross-lingual Speech Representation Learning with Sparse Sharing Sub-networks	Mar 9, 2022	Representation Learning, speech-recognition	Unverified	0
A Brief Overview of Unsupervised Neural Speech Representation Learning	Mar 1, 2022	Representation Learning, Speech Representation Learning	Unverified	0
A Noise-Robust Self-supervised Pre-training Model Based Speech Representation Learning for Automatic Speech Recognition	Jan 22, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
A Deep Paradigm for Articulatory Speech Representation Learning via Neural Convolutive Sparse Matrix Factorization	Jan 16, 2022	Phoneme Recognition, Representation Learning	Unverified	0
Robust Self-Supervised Audio-Visual Speech Recognition	Jan 5, 2022	Audio-Visual Speech Recognition, Automatic Speech Recognition	Code Available	2
Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction	Jan 5, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	2
Robust Speech Representation Learning via Flow-based Embedding Regularization	Dec 7, 2021	Deep Learning, Language Identification	Unverified	0
XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale	Nov 17, 2021	Language Identification, Representation Learning	Code Available	1
Characterizing the adversarial vulnerability of speech self-supervised learning	Nov 8, 2021	Adversarial Robustness, Benchmarking	Unverified	0
Improving Noise Robustness of Contrastive Speech Representation Learning with Speech Reconstruction	Oct 28, 2021	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
Speech Representation Learning Through Self-supervised Pretraining And Multi-task Finetuning	Oct 18, 2021	Multi-Task Learning, Representation Learning	Unverified	0
Conformer-Based Self-Supervised Learning for Non-Speech Audio Tasks	Oct 14, 2021	Audio Classification, Representation Learning	Unverified	0
UniSpeech-SAT: Universal Speech Representation Learning with Speaker Aware Pre-Training	Oct 12, 2021	Data Augmentation, Multi-Task Learning	Code Available	1
DistilHuBERT: Speech Representation Learning by Layer-wise Distillation of Hidden-unit BERT	Oct 5, 2021	Multi-Task Learning, Representation Learning	Code Available	0
W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training	Aug 7, 2021	Contrastive Learning, Language Modeling	Code Available	3
An Adapter Based Pre-Training for Efficient and Scalable Self-Supervised Speech Representation Learning	Jul 26, 2021	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
Pretext Tasks selection for multitask self-supervised speech representation learning	Jul 1, 2021	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	0
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units	Jun 14, 2021	Clustering, Language Modelling	Code Available	1
PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition	Jun 10, 2021	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0

