Data-Driven Mispronunciation Pattern Discovery for Robust Speech Recognition

2025-02-01

Anna Seo Gyeong Choi, JongHyeon Park, Myungwoo Oh


Abstract

Recent advancements in machine learning have significantly improved speech recognition, but recognizing speech from non-fluent or accented speakers remains a challenge. Previous efforts, relying on rule-based pronunciation patterns, have struggled to fully capture non-native errors. We propose two data-driven approaches using speech corpora to automatically detect mispronunciation patterns. By aligning non-native phones with their native counterparts using attention maps, we achieved a 5.7% improvement in speech recognition on native English datasets and a 12.8% improvement for non-native English speakers, particularly Korean speakers. Our method offers practical advancements for robust Automatic Speech Recognition (ASR) systems, particularly in situations where prior linguistic knowledge is not applicable.

Tasks

Automatic Speech Recognition (ASR), Robust Speech Recognition, Speech Recognition
