Word Error Rate Estimation for Speech Recognition: e-WER

2018-07-01 · ACL 2018 · Code Available

Ahmed Ali, Steve Renals


Code

github.com/qcri/e-wer
Official implementation, linked in the paper (★ 16)

Abstract

Measuring the performance of automatic speech recognition (ASR) systems requires manually transcribed data in order to compute the word error rate (WER), which is often time-consuming and expensive. In this paper, we propose a novel approach to estimate WER, or e-WER, which does not require a gold-standard transcription of the test set. Our e-WER framework uses a comprehensive set of features: ASR recognised text, character recognition results to complement recognition output, and internal decoder features. We report results for two feature sets, black-box and glass-box, using 24 unseen Arabic broadcast programs. Our system achieves a WER root mean squared error (RMSE) of 16.9% across 1,400 sentences. The overall estimated WER (e-WER) was 25.3% for the three-hour test set, while the actual WER was 28.5%.
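
For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of how reference-based WER and the RMSE between estimated and actual per-sentence WER are conventionally computed. It is not taken from the paper or from the qcri/e-wer repository, and it does not implement the e-WER estimator itself; the function names `word_errors`, `wer`, and `rmse` are illustrative assumptions.

```python
import math


def word_errors(reference, hypothesis):
    """Return (word-level edit distance, number of reference words)."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            curr[j] = min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution or match
            )
        prev = curr
    return prev[-1], len(ref)


def wer(reference, hypothesis):
    """Conventional WER: errors divided by reference length.

    Assumes a non-empty reference transcript.
    """
    errors, n_ref = word_errors(reference, hypothesis)
    return errors / n_ref


def rmse(estimated, actual):
    """Root mean squared error between two equal-length sequences."""
    squared = [(e - a) ** 2 for e, a in zip(estimated, actual)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Toy English sentences, purely for illustration.
    print(wer("the cat sat on the mat", "the cat sit on mat"))
    # Hypothetical per-sentence estimated vs. actual WER values.
    print(rmse([0.20, 0.35], [0.25, 0.30]))
```

In this sketch, `wer` needs the gold-standard transcript, which is exactly the requirement e-WER is designed to avoid; `rmse` shows how a set of per-sentence estimates would be compared against the actual values once references are available for evaluation.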

Tasks

Automatic Speech Recognition (ASR), Decoder, Language Modelling, Machine Translation, Speech Recognition, Word Embeddings
