Building and benchmarking an Arabic Speech Commands dataset for small-footprint keyword spotting
Abdulkader Ghandoura, Farouk Hjabo, Oumayma Al Dakkak
Abstract
The introduction of the Google Speech Commands dataset accelerated research on keyword spotting and led to a variety of new deep learning approaches to the task. The main contribution of this work is the construction of an Arabic Speech Commands dataset, a counterpart to Google’s dataset. Our dataset consists of 12,000 instances collected from 30 contributors and grouped into 40 keywords. We also report experiments that benchmark this dataset using classical machine learning and deep learning approaches; the best of these, a Convolutional Neural Network trained on Mel-Frequency Cepstral Coefficients, achieved an accuracy of approximately 98%. Additionally, we point out some key ideas to consider in such tasks.
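As a quick sanity check on the dataset figures quoted above, the following minimal sketch works out the per-keyword and per-contributor counts that those totals would imply. It assumes a perfectly balanced dataset, in which every contributor records the same number of utterances for every keyword; the abstract does not state this, and the function name `balanced_counts` is purely illustrative.

```python
# Figures stated in the abstract.
TOTAL_INSTANCES = 12_000
NUM_CONTRIBUTORS = 30
NUM_KEYWORDS = 40


def balanced_counts(total, contributors, keywords):
    """Return the counts implied by the totals.

    ASSUMPTION (not stated in the abstract): the dataset is perfectly
    balanced across contributors and keywords.
    """
    pairs = contributors * keywords
    if total % pairs != 0:
        # The totals cannot be split evenly, so the balance assumption fails.
        raise ValueError("total does not divide evenly across contributor/keyword pairs")
    return {
        "per_keyword": total // keywords,
        "per_contributor": total // contributors,
        "per_contributor_keyword": total // pairs,
    }


counts = balanced_counts(TOTAL_INSTANCES, NUM_CONTRIBUTORS, NUM_KEYWORDS)
print(counts)
```

Under that assumption the totals divide evenly, which is at least consistent with a balanced collection protocol; if the real distribution is uneven, these figures are only averages.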