Training Neural Speech Recognition Systems with Synthetic Speech Augmentation

2018-10-22

Anonymous

Abstract

Building an accurate automatic speech recognition (ASR) system requires a large dataset containing many hours of labeled speech from a diverse set of speakers. The lack of such free, open datasets is one of the main obstacles to progress in ASR research. To address this problem, we propose augmenting a natural speech dataset with synthetic speech. We train very large end-to-end neural speech recognition models on the LibriSpeech dataset augmented with synthetic speech. These models achieve state-of-the-art word error rate (WER) for character-based models without an external language model.
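For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions, deletions, and insertions) between the reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition; the function name `word_error_rate` and the whitespace tokenization are assumptions for illustration, and the abstract does not describe the scoring pipeline the authors actually used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; tokenization is plain
    whitespace splitting, with no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (sat -> sit) and one deletion (the) over 6 reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Because insertions count as errors, WER computed this way can exceed 1.0 when the hypothesis is much longer than the reference.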

Tasks

Automatic Speech Recognition (ASR), Language Modeling, Speech Recognition
