Vietnamese Automatic Speech Recognition using Wav2vec 2.0
2022-05-08 · https://huggingface.co/khanhld/wav2vec2-base-vietnamese-160h
Le Duy Khanh
- github.com/khanld/ASR-Wa2vec-Finetune (official, PyTorch, ★ 149)
Abstract
We fine-tuned the Wav2vec2 base model on about 160 hours of Vietnamese speech from several sources, including VIOS, COMMON VOICE, FOSD, and VLSP (100h), using Connectionist Temporal Classification (CTC). As a result, we achieve 10.78% and 15.05% WER (without a language model) on the COMMON VOICE and VIOS test sets, respectively.
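The abstract refers to CTC training and reports results as WER. As a rough illustration only (not the authors' code, and independent of any specific library), the two core ideas can be sketched in plain Python: CTC greedy decoding collapses repeated frame predictions and drops the blank token, while WER is the word-level edit distance divided by the reference length:

```python
def ctc_greedy_decode(frame_ids, blank_id=0):
    """Collapse consecutive duplicates, then drop blanks (the CTC rule)."""
    out, prev = [], None
    for t in frame_ids:
        if t != prev and t != blank_id:
            out.append(t)
        prev = t
    return out

def wer(ref_words, hyp_words):
    """Word Error Rate: Levenshtein distance over words / reference length."""
    d = list(range(len(hyp_words) + 1))  # DP row of edit distances
    for i, r in enumerate(ref_words, 1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp_words, 1):
            cur = d[j]
            # deletion, insertion, substitution (or match)
            d[j] = min(d[j] + 1, d[j - 1] + 1, prev + (r != h))
            prev = cur
    return d[-1] / len(ref_words)

# Per-frame argmax IDs from an acoustic model (0 = blank, hypothetical values)
print(ctc_greedy_decode([0, 1, 1, 0, 2, 2, 2, 0, 1]))  # → [1, 2, 1]
print(wer("xin chào bạn".split(), "xin chao bạn".split()))  # one substitution in three words
```

In practice the model's reported WER would be computed over the full COMMON VOICE and VIOS test transcripts rather than single sentences as above.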