PI-Whisper: Designing an Adaptive and Incremental Automatic Speech Recognition System for Edge Devices
Amir Nassereldine, Dancheng Liu, Chenhui Xu, Ruiyang Qin, Yiyu Shi, JinJun Xiong
Unverified — Be the first to reproduce this paper.
ReproduceAbstract
Edge-based automatic speech recognition (ASR) technologies are increasingly prevalent in the development of intelligent and personalized assistants. However, resource-constrained ASR models face significant challenges in adaptivity, incrementality, and inclusivity when faced with a diverse population. To tackle those challenges, we propose PI-Whisper, a novel ASR system that adaptively enhances recognition capabilities by identifying speakers' characteristics in real-time. In this work, we show how the design of PI-Whisper allows for incremental adaptation of new characteristics without the need for repetitive retraining, enhances recognition capabilities, and improves equity and fairness across diverse speaker groups. PI-Whisper demonstrates these advantages by achieving state-of-the-art accuracy, reducing the word error rate (WER) by up to 13.7% relative to baselines while scaling linearly to computing resources.