Vietnamese Automatic Speech Recognition using Wav2vec 2.0
2022-05-08 · https://huggingface.co/khanhld/wav2vec2-base-vietnamese-160h
Le Duy Khanh
- github.com/khanld/ASR-Wa2vec-Finetune (official, PyTorch, ★ 149)
Abstract
We fine-tuned the Wav2vec2 base model on about 160 hours of Vietnamese speech from several sources, including VIOS, COMMON VOICE, FOSD, and VLSP (100h), using Connectionist Temporal Classification (CTC). As a result, we achieve 10.78% and 15.05% WER (without a language model) on the COMMON VOICE and VIOS test sets, respectively.
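The abstract refers to CTC training and reports results as WER. As a rough illustration only (not the authors' code, and independent of any specific library), the two core ideas can be sketched in plain Python: CTC greedy decoding collapses repeated frame predictions and drops the blank token, while WER is the word-level edit distance divided by the reference length:

```python
def ctc_greedy_decode(frame_ids, blank_id=0):
    """Collapse consecutive duplicates, then drop blanks (the CTC rule)."""
    out, prev = [], None
    for t in frame_ids:
        if t != prev and t != blank_id:
            out.append(t)
        prev = t
    return out

def wer(ref_words, hyp_words):
    """Word Error Rate: Levenshtein distance over words / reference length."""
    d = list(range(len(hyp_words) + 1))  # DP row of edit distances
    for i, r in enumerate(ref_words, 1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp_words, 1):
            cur = d[j]
            # deletion, insertion, substitution (or match)
            d[j] = min(d[j] + 1, d[j - 1] + 1, prev + (r != h))
            prev = cur
    return d[-1] / len(ref_words)

# Per-frame argmax IDs from an acoustic model (0 = blank, hypothetical values)
print(ctc_greedy_decode([0, 1, 1, 0, 2, 2, 2, 0, 1]))  # → [1, 2, 1]
print(wer("xin chào bạn".split(), "xin chao bạn".split()))  # one substitution in three words
```

In practice the model's reported WER would be computed over the full COMMON VOICE and VIOS test transcripts rather than single sentences as above.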