TED-LIUM 3: twice as much data and corpus repartition for experiments on speaker adaptation

2018-05-12

François Hernandez, Vincent Nguyen, Sahar Ghannay, Natalia Tomashenko, Yannick Estève

Code

github.com/kaldi-asr/kaldi/tree/master/egs/tedlium/s5_r3 · Official · none · ★ 0
github.com/huggingface/datasets · tf · ★ 21,322
github.com/mdangschat/speech-corpus-dl · none · ★ 0

Abstract

In this paper, we present the TED-LIUM release 3 corpus, dedicated to speech recognition in English, which more than doubles the data available for training acoustic models compared with TED-LIUM 2. We present recent developments in Automatic Speech Recognition (ASR) systems in comparison with the two previous releases of the TED-LIUM corpus, from 2012 and 2014. We demonstrate that going from 207 to 452 hours of transcribed speech training data is more useful for end-to-end ASR systems than for state-of-the-art HMM-based ones, even though the HMM-based ASR system still outperforms the end-to-end ASR system when trained on 452 hours of audio, with Word Error Rates (WER) of 6.6% and 13.7%, respectively. Finally, we propose two partitions of the TED-LIUM release 3 corpus: the legacy one, identical to the one in release 2, and a new one, calibrated and designed for experiments on speaker adaptation. Like the first two releases, the TED-LIUM 3 corpus will be freely available to the research community.
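
The abstract compares systems by Word Error Rate. As a rough illustration of that metric only, here is a minimal sketch that computes WER as the word-level edit distance divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for this example; it is not the scoring tool used by the authors, and real evaluations typically apply text normalization that is omitted here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; tokenization is a plain
    whitespace split, with no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference prefix processed
    # so far (initially empty) and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(round(word_error_rate("the cat sat on the mat",
                                "the cat sat on mat"), 4))  # 0.1667
```

Under this definition, a WER of 6.6% corresponds to roughly 66 word errors per 1,000 reference words, and 13.7% to roughly 137.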

Tasks

Automatic Speech Recognition (ASR), Speech Recognition
