How Accents Confound: Probing for Accent Information in End-to-End Speech Recognition Systems

2020-07-01 · ACL 2020

Archiki Prasad, Preethi Jyothi

Abstract

In this work, we present a detailed analysis of how accent information is reflected in the internal representation of speech in an end-to-end automatic speech recognition (ASR) system. We use a state-of-the-art end-to-end ASR system, comprising convolutional and recurrent layers, that is trained on a large amount of US-accented English speech, and we evaluate the model on speech samples from seven different English accents. We examine the effects of accent on the internal representation using three main probing techniques: a) gradient-based explanation methods, b) information-theoretic measures, and c) outputs of accent and phone classifiers. We find that different accents exhibit similar trends irrespective of the probing technique used. We also find that most accent information is encoded within the first recurrent layer, which suggests how one could adapt such an end-to-end model to learn representations that are invariant to accents.
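The classifier-based probing idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' setup: the activations are synthetic, the nearest-centroid probe is a deliberately simple stand-in for a trained accent classifier, and the names `probe_accuracy`, `make_synthetic_layer`, and the two layer labels are hypothetical. The point is only that held-out probe accuracy on a layer's representations indicates how much accent information that layer carries.

```python
import numpy as np


def probe_accuracy(train_x, train_y, test_x, test_y):
    """Fit a nearest-centroid probe and return held-out accuracy."""
    labels = np.unique(train_y)
    # One centroid per accent label, computed from the training activations.
    centroids = np.stack(
        [train_x[train_y == label].mean(axis=0) for label in labels]
    )
    # Squared Euclidean distance from every test vector to every centroid.
    dists = ((test_x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    predictions = labels[dists.argmin(axis=1)]
    return float((predictions == test_y).mean())


def make_synthetic_layer(rng, n_per_accent, n_accents, dim, separation):
    """Hypothetical stand-in for pooled layer activations (not real ASR features)."""
    # Each accent gets its own mean vector; separation=0 removes accent information.
    means = rng.normal(size=(n_accents, dim)) * separation
    x = np.concatenate(
        [rng.normal(size=(n_per_accent, dim)) + means[a] for a in range(n_accents)]
    )
    y = np.repeat(np.arange(n_accents), n_per_accent)
    return x, y


rng = np.random.default_rng(0)
results = {}
for layer_name, separation in [
    ("layer_with_accent_info", 3.0),
    ("layer_without_accent_info", 0.0),
]:
    x, y = make_synthetic_layer(
        rng, n_per_accent=200, n_accents=7, dim=16, separation=separation
    )
    # Random half/half split into probe-training and held-out examples.
    order = rng.permutation(len(y))
    split = len(y) // 2
    train, test = order[:split], order[split:]
    results[layer_name] = probe_accuracy(x[train], y[train], x[test], y[test])
    print(layer_name, round(results[layer_name], 3))
```

In a real study the synthetic arrays would be replaced by activations extracted from each layer of the ASR model, and the probe would typically be a trained classifier rather than a centroid rule. Comparing accuracies across layers is what lets one say where accent information is concentrated.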

Tasks

Automatic Speech Recognition (ASR), Speech Recognition
