Fast Development of ASR in African Languages using Self Supervised Speech Representation Learning

2021-03-16

Jama Hussein Mohamud, Lloyd Acquaye Thompson, Aissatou Ndoye, Laurent Besacier

Code

github.com/besacier/AMMIcourse
Official, in paper, ★ 43

Abstract

This paper describes the results of an informal collaboration launched during the African Master of Machine Intelligence (AMMI) in June 2020. After a series of lectures and labs on speech data collection using mobile applications and on self-supervised representation learning from speech, a small group of students and the lecturer continued working on an automatic speech recognition (ASR) project for three languages: Wolof, Ga, and Somali. The paper describes how the data was collected and how ASR systems were developed with a small amount (1h) of transcribed speech as training data. In these low-resource conditions, pre-training a model on large amounts of raw speech was fundamental to the efficiency of the ASR systems developed.

Tasks

Automatic Speech Recognition, Automatic Speech Recognition (ASR), Representation Learning, speech-recognition, Speech Recognition, Speech Representation Learning
