Spot keywords from very noisy and mixed speech

2023-05-28

Ying Shi, Dong Wang, Lantian Li, Jiqing Han, Shi Yin

Abstract

Most existing keyword spotting research focuses on conditions with slight or moderate noise. In this paper, we tackle a more challenging task: detecting keywords buried under strong interfering speech (10 times higher than the keyword in amplitude) and, even worse, mixed with other keywords. We propose a novel Mix Training (MT) strategy that encourages the model to discover low-energy keywords in noisy and mixed speech. Experiments were conducted with a vanilla CNN and two EfficientNet (B0/B2) architectures. Results on the Google Speech Command dataset demonstrate that the proposed mix training approach is highly effective and outperforms standard data augmentation and mixup training.
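To give a feel for the test condition the abstract describes, the sketch below mixes a keyword with interference scaled to 10 times its amplitude, which corresponds to a signal-to-interference ratio of about -20 dB. It is only an illustration of that mixing condition, not the authors' Mix Training procedure. The function names (`rms`, `mix_at_amplitude_ratio`, `sir_db`), the use of RMS as the measure of amplitude, the 16 kHz one-second length, and the random-noise stand-ins for real recordings are all assumptions made for this example.

```python
import numpy as np


def rms(x):
    """Root-mean-square level of a signal (assumed here as the 'amplitude')."""
    return float(np.sqrt(np.mean(np.square(x))))


def mix_at_amplitude_ratio(keyword, interference, ratio=10.0):
    """Scale `interference` so its RMS is `ratio` times the keyword RMS, then add.

    Returns the mixture and the scaled interference. Hypothetical helper,
    not taken from the paper.
    """
    n = min(len(keyword), len(interference))
    keyword = np.asarray(keyword[:n], dtype=np.float64)
    interference = np.asarray(interference[:n], dtype=np.float64)
    k_rms, i_rms = rms(keyword), rms(interference)
    if k_rms == 0.0 or i_rms == 0.0:
        raise ValueError("both signals must be non-silent")
    scaled = interference * (ratio * k_rms / i_rms)
    return keyword + scaled, scaled


def sir_db(keyword, interference):
    """Signal-to-interference ratio in dB, computed from RMS amplitudes."""
    return 20.0 * np.log10(rms(keyword) / rms(interference))


# Random noise stands in for real keyword and interfering-speech recordings.
rng = np.random.default_rng(0)
keyword = 0.1 * rng.standard_normal(16000)
interference = rng.standard_normal(16000)

mixture, scaled = mix_at_amplitude_ratio(keyword, interference, ratio=10.0)
print(round(sir_db(keyword, scaled), 1))  # -20.0
```

An amplitude ratio of 10 gives 20·log10(1/10) = -20 dB, so the keyword carries roughly 1% of the interference energy in the mixture.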
