Unsupervised Speech Recognition with N-Skipgram and Positional Unigram Matching
2023-10-03
Liming Wang, Mark Hasegawa-Johnson, Chang D. Yoo
- Code: github.com/lwang114/graphunsupasr (official, PyTorch)
Abstract
Training unsupervised speech recognition systems is challenging because of GAN-associated instability, misalignment between speech and text, and significant memory demands. To tackle these challenges, we introduce a novel ASR system, ESPUM. The system combines lower-order N-skipgrams (up to N=3) with positional unigram statistics gathered from a small batch of samples. Evaluated on the TIMIT benchmark, our model achieves competitive performance on ASR and phoneme segmentation tasks. Our code is publicly available at https://github.com/lwang114/GraphUnsupASR.
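
To make the two kinds of statistics named in the abstract concrete, here is a minimal Python sketch of how they might be counted over a small batch of phoneme sequences. It is an illustration only, not the ESPUM implementation: the abstract does not define either statistic, so the sketch assumes a skipgram is a pair of tokens a fixed number of positions apart and a positional unigram is a token paired with its index in the sequence. The function names and the toy batch are hypothetical.

```python
from collections import Counter


def skipgram_counts(sequences, skip):
    """Count token pairs that sit exactly `skip` positions apart.

    Assumption: this is one common reading of "skipgram"; the paper's
    exact definition may differ.
    """
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - skip):
            counts[(seq[i], seq[i + skip])] += 1
    return counts


def positional_unigram_counts(sequences):
    """Count how often each token occurs at each position index.

    Assumption: "positional unigram" is read here as a (position, token) pair.
    """
    counts = Counter()
    for seq in sequences:
        for pos, tok in enumerate(seq):
            counts[(pos, tok)] += 1
    return counts


# Hypothetical toy batch of phoneme sequences.
batch = [["s", "ih", "t"], ["s", "ae", "t"], ["ih", "t"]]

skip1 = skipgram_counts(batch, 1)       # adjacent pairs
skip2 = skipgram_counts(batch, 2)       # pairs two positions apart
positional = positional_unigram_counts(batch)
```

Both statistics are simple count tables, so they can be accumulated from a small batch without holding a full alignment in memory, which is consistent with the memory motivation stated in the abstract.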