
A 71.2-μW Speech Recognition Accelerator with Recurrent Spiking Neural Network

2025-03-27

Chih-Chyau Yang, Tian-Sheuan Chang


Abstract

This paper introduces a 71.2-μW speech recognition accelerator designed for real-time applications on edge devices, emphasizing an ultra-low-power design. Through algorithm and hardware co-optimization, we propose a compact recurrent spiking neural network with two recurrent layers, one fully connected layer, and a low time step (1 or 2). The 2.79-MB model undergoes pruning and 4-bit fixed-point quantization, shrinking it by 96.42% to 0.1 MB. On the hardware front, we exploit mixed-level pruning, zero-skipping, and merged-spike techniques, reducing complexity by 90.49% to 13.86 MMAC/s. Parallel time-step execution addresses inter-time-step data dependencies and enables weight-buffer power savings through weight sharing. Capitalizing on sparse spike activity, an input-broadcasting scheme eliminates zero computations, further saving power. Implemented in a TSMC 28-nm process, the design operates in real time at 100 kHz while consuming 71.2 μW, surpassing state-of-the-art designs. At 500 MHz, it achieves an energy efficiency of 28.41 TOPS/W and an area efficiency of 1903.11 GOPS/mm^2.
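The zero-skipping and input-broadcasting ideas in the abstract can be illustrated with a minimal software sketch: because spike inputs are binary and sparse, a layer's output only needs contributions from weight columns whose input spike is active, so zero inputs incur no computation. This is an illustrative Python sketch of the general technique under assumed shapes and names, not the authors' hardware implementation.

```python
import numpy as np

def sparse_spike_matvec(weights, spikes):
    """Compute weights @ spikes for a binary spike vector by accumulating
    only the columns with an active spike (zero-skipping). Each selected
    column is broadcast-added to all output neurons at once, mirroring
    the input-broadcasting idea described in the abstract."""
    acc = np.zeros(weights.shape[0], dtype=weights.dtype)
    for j in np.flatnonzero(spikes):  # visit active spikes only
        acc += weights[:, j]          # broadcast column to all outputs
    return acc
```

With a typical spike sparsity this loop touches only a small fraction of the weight matrix, which is where the power savings of skipping zero computations come from.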
