Hello Edge: Keyword Spotting on Microcontrollers

2017-11-20

Yundong Zhang, Naveen Suda, Liangzhen Lai, Vikas Chandra

Abstract

Keyword spotting (KWS) is a critical component for enabling speech-based user interactions on smart devices. It requires real-time response and high accuracy for a good user experience. Recently, neural networks have become an attractive choice for KWS architectures because of their superior accuracy compared to traditional speech processing algorithms. Due to its always-on nature, a KWS application has a highly constrained power budget and typically runs on tiny microcontrollers with limited memory and compute capability. The design of a neural network architecture for KWS must consider these constraints. In this work, we perform neural network architecture evaluation and exploration for running KWS on resource-constrained microcontrollers. We train various neural network architectures for keyword spotting published in the literature to compare their accuracy and memory/compute requirements. We show that it is possible to optimize these neural network architectures to fit within the memory and compute constraints of microcontrollers without sacrificing accuracy. We further explore the depthwise separable convolutional neural network (DS-CNN) and compare it against other neural network architectures. DS-CNN achieves an accuracy of 95.4%, which is ~10% higher than the DNN model with a similar number of parameters.
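To illustrate why a depthwise separable convolution suits a tight memory budget, the sketch below counts the weights in one standard convolution layer and in its depthwise separable counterpart. The layer shape (64 input channels, 64 output channels, 3x3 kernel) and the function names are assumptions chosen for illustration; they are not taken from the paper.

```python
def conv2d_params(c_in, c_out, k_h, k_w, bias=True):
    """Parameter count of a standard 2-D convolution layer."""
    return k_h * k_w * c_in * c_out + (c_out if bias else 0)


def ds_conv2d_params(c_in, c_out, k_h, k_w, bias=True):
    """Parameter count of a depthwise separable convolution:
    one k_h x k_w filter per input channel (depthwise),
    followed by a 1x1 convolution that mixes channels (pointwise)."""
    depthwise = k_h * k_w * c_in + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


# Hypothetical layer shape, chosen only for illustration.
C_IN, C_OUT, K = 64, 64, 3

standard = conv2d_params(C_IN, C_OUT, K, K)
separable = ds_conv2d_params(C_IN, C_OUT, K, K)

print(f"standard conv:  {standard} parameters")
print(f"separable conv: {separable} parameters")
print(f"reduction:      {standard / separable:.1f}x")
```

For this assumed shape the separable layer needs several times fewer weights than the standard one, which is the kind of saving that matters when the whole model has to fit in microcontroller memory.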

Benchmark Results

| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| Google Speech Commands | DS-CNN | Google Speech Commands V1 12 | 94.4 | | Unverified |
| Google Speech Commands | GRU | Google Speech Commands V1 12 | 93.5 | | Unverified |
| Google Speech Commands | Basic LSTM | Google Speech Commands V1 12 | 92 | | Unverified |
| Google Speech Commands | DNN | Google Speech Commands V1 12 | 91.6 | | Unverified |
| Google Speech Commands | CNN | Google Speech Commands V1 12 | 84.6 | | Unverified |
