Keyword Spotting

In speech processing, keyword spotting deals with the identification of keywords in utterances.

(Image credit: Simon Grest)

Papers

Showing 351–400 of 407 papers

Title	Date	Tasks	Status
Split Federated Learning on Micro-controllers: A Keyword Spotting Showcase	Oct 4, 2022	Federated Learning, Keyword Spotting	Unverified
Spoken Language Identification using ConvNets	Oct 9, 2019	Keyword Spotting, Language Identification	Unverified
Let SSMs be ConvNets: State-space Modeling with Optimal Tensor Contractions	Jan 22, 2025	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available
Honkling: In-Browser Personalization for Ubiquitous Keyword Spotting	Nov 1, 2019	Keyword Spotting	Code Available
Keyword Spotting Simplified: A Segmentation-Free Approach using Character Counting and CTC re-scoring	Aug 7, 2023	Keyword Spotting, object-detection	Code Available
Stochastic Adaptive Neural Architecture Search for Keyword Spotting	Nov 16, 2018	Keyword Spotting, Neural Architecture Search	Code Available
End-to-end Keyword Spotting using Xception-1d	Oct 9, 2021	Keyword Spotting	Code Available
Efficient keyword spotting using dilated convolutions and gating	Nov 19, 2018	Keyword Spotting, Small-Footprint Keyword Spotting	Code Available
Honk: A PyTorch Reimplementation of Convolutional Neural Networks for Keyword Spotting	Oct 18, 2017	Keyword Spotting, speech-recognition	Code Available
Building and benchmarking an Arabic Speech Commands dataset for small-footprint keyword spotting	May 7, 2021	Benchmarking, Deep Learning	Code Available
Masked Modeling Duo: Learning Representations by Encouraging Both Networks to Model the Input	Oct 26, 2022	Audio Classification, Audio Tagging	Code Available
Keyword localisation in untranscribed speech using visually grounded speech models	Feb 2, 2022	Keyword Spotting, TAG	Code Available
Semi-Supervised Federated Learning for Keyword Spotting	May 9, 2023	Federated Learning, Keyword Spotting	Code Available
AraSpot: Arabic Spoken Command Spotting	Mar 29, 2023	Data Augmentation, Keyword Spotting	Code Available
Efficient Keyword Spotting by capturing long-range interactions with Temporal Lambda Networks	Apr 16, 2021	Keyword Spotting, speech-recognition	Code Available
An Investigation of Few-Shot Learning in Spoken Term Classification	Dec 26, 2018	Few-Shot Learning, General Classification	Code Available
What’s Cookin’? Interpreting Cooking Videos using Text, Speech and Vision	May 1, 2015	Keyword Spotting	Code Available
Small-Footprint Keyword Spotting on Raw Audio Data with Sinc-Convolutions	Nov 5, 2019	Keyword Spotting, Small-Footprint Keyword Spotting	Code Available
Hello Edge: Keyword Spotting on Microcontrollers	Nov 20, 2017	Keyword Spotting	Code Available
JavaScript Convolutional Neural Networks for Keyword Spotting in the Browser: An Experimental Analysis	Oct 30, 2018	Keyword Spotting, Model Compression	Code Available
Integrated Parameter-Efficient Tuning for General-Purpose Audio Models	Nov 4, 2022	Genre classification, Keyword Spotting	Code Available
What's Cookin'? Interpreting Cooking Videos using Text, Speech and Vision	Mar 5, 2015	Keyword Spotting	Code Available
Indian EmoSpeech Command Dataset: A dataset for emotion based speech recognition in the wild	Oct 18, 2019	Emotion Recognition, Keyword Spotting	Code Available
ImportantAug: a data augmentation agent for speech	Dec 14, 2021	Data Augmentation, Keyword Spotting	Code Available
GTM-UVigo Systems for the Query-by-Example Search on Speech Task at MediaEval 2015	Sep 14, 2015	Decoder, Dynamic Time Warping	Code Available
Audiomer: A Convolutional Transformer For Keyword Spotting	Sep 21, 2021	Keyword Spotting	Code Available
ED-CEC: Improving Rare Word Recognition Using ASR Postprocessing Based on Error Detection and Context-Aware Error Correction	Oct 8, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available
The NPU System for the 2020 Personalized Voice Trigger Challenge	Feb 26, 2021	Keyword Spotting, Small-Footprint Keyword Spotting	Code Available
Audio Explanation Synthesis with Generative Foundation Models	Oct 10, 2024	Benchmarking, Decision Making	Code Available
DONUT: CTC-based Query-by-Example Keyword Spotting	Nov 26, 2018	Keyword Spotting	Code Available
Learning Delays Through Gradients and Structure: Emergence of Spatiotemporal Patterns in Spiking Neural Networks	Jul 7, 2024	Keyword Spotting	Code Available
Filler Word Detection and Classification: A Dataset and Benchmark	Mar 28, 2022	Classification, Keyword Spotting	Code Available
Advances in Small-Footprint Keyword Spotting: A Comprehensive Review of Efficient Models and Algorithms	Jun 12, 2025	Automatic Speech Recognition, Keyword Spotting	Code Available
Boosting keyword spotting through on-device learnable user speech characteristics	Mar 12, 2024	Few-Shot Learning, Keyword Spotting	Code Available
Distilled Non-Semantic Speech Embeddings with Binary Neural Networks for Low-Resource Devices	Jul 12, 2022	Emotion Recognition, Keyword Spotting	Code Available
Trainable Frontend For Robust and Far-Field Keyword Spotting	Jul 19, 2016	Keyword Spotting, speech-recognition	Code Available
TACos: Learning Temporally Structured Embeddings for Few-Shot Keyword Spotting with Dynamic Time Warping	May 18, 2023	Dynamic Time Warping, Keyword Spotting	Code Available
Temporal Convolution for Real-time Keyword Spotting on Mobile Devices	Apr 8, 2019	Keyword Spotting	Code Available
Temporal Feedback Convolutional Recurrent Neural Networks for Speech Command Recognition	Oct 30, 2019	Keyword Spotting	Code Available
Attention-based End-to-End Models for Small-Footprint Keyword Spotting	Mar 29, 2018	Keyword Spotting, Small-Footprint Keyword Spotting	Code Available
Federated Learning for Keyword Spotting	Oct 9, 2018	Federated Learning, Keyword Spotting	Code Available
Neural ODE with Temporal Convolution and Time Delay Neural Networks for Small-Footprint Keyword Spotting	Aug 1, 2020	Keyword Spotting, Small-Footprint Keyword Spotting	Code Available
Neuromorphic Keyword Spotting with Pulse Density Modulation MEMS Microphones	Aug 9, 2024	Keyword Spotting	Code Available
READ-BAD: A New Dataset and Evaluation Scheme for Baseline Detection in Archival Documents	May 9, 2017	Binarization, Keyword Spotting	Code Available
Benchmarking Keyword Spotting Efficiency on Neuromorphic Hardware	Dec 4, 2018	Benchmarking, CPU	Code Available
Noise-Robust Keyword Spotting through Self-supervised Pretraining	Mar 27, 2024	Denoising, Keyword Spotting	Code Available
Tiny, always-on and fragile: Bias propagation through design choices in on-device machine learning workflows	Jan 19, 2022	Keyword Spotting	Code Available
Evaluating Sequence-to-Sequence Models for Handwritten Text Recognition	Mar 18, 2019	Decoder, Handwritten Text Recognition	Code Available
Adversarial Example Detection by Classification for Deep Speech Recognition	Oct 22, 2019	Classification, General Classification	Code Available
What is Learnt by the LEArnable Front-end (LEAF)? Adapting Per-Channel Energy Normalisation (PCEN) to Noisy Conditions	Apr 10, 2024	Emotion Recognition, Keyword Spotting	Code Available

Datasets: QUESST, Google Speech Commands, hey Siri, FKD, Google Speech Commands V2 35, TensorFlow, VoxForge, Google Speech Commands (v2), Google Speech Commands V2 12

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	NNI non-filtered(for the development set)	Cnxe	6.09	—	Unverified
2	NNI Choi(for the development set)	Cnxe	5.89	—	Unverified
3	NTU rnn (eval)	Cnxe	2.01	—	Unverified
4	NTU dtw (eval)	Cnxe	2.01	—	Unverified
5	NTU dtw (dev)	Cnxe	2.01	—	Unverified
6	NTU rnn (dev)	Cnxe	2.01	—	Unverified
7	ELiRF SDTW (eval)	Cnxe	1.19	—	Unverified
8	ELiRF SDTW-avg (eval)	Cnxe	1.07	—	Unverified
9	ELiRF SDTW (dev)	Cnxe	1.07	—	Unverified
10	CUNY [Subseq+MFCC] (eval)	Cnxe	1.07	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	WaveFormer	Google Speech Commands V2 12	98.8	—	Unverified
2	QNN	Google Speech Commands V2 35	98.6	—	Unverified
3	TripletLoss-res15	Google Speech Commands V1 12	98.56	—	Unverified
4	M2D	Google Speech Commands V2 35	98.5	—	Unverified
5	EAT-S	Google Speech Commands V2 35	98.15	—	Unverified
6	Audio Spectrogram Transformer	Google Speech Commands V2 35	98.11	—	Unverified
7	EdgeCRNN 2.0×	Google Speech Commands V2 12	98.05	—	Unverified
8	BC-ResNet-8	Google Speech Commands V1 12	98	—	Unverified
9	HTS-AT	Google Speech Commands V2 35	98	—	Unverified
10	Wav2KWS	Google Speech Commands V1 12	97.9	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Stacked 1D CNN	Error Rate	1.99	—	Unverified
2	End-to-end DNN-HMM	Error Rate	1.7	—	Unverified
3	HEiMDaL	Error Rate	0.45	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Res26	Accuracy	95.88	—	Unverified
2	EfficientNet-A0 + SA + TL	Accuracy	95.83	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	QuaternionNeuralNetwork	Accuracy (10-fold)	98.53	—	Unverified
2	SSAMBA	Accuracy (10-fold)	97.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TensorFlow's model version 2	TFMA	89.7	—	Unverified
2	TensorFlow's model version 1	TFMA	85.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	2D-ConvNet	Accuracy (%)	95.4	—	Unverified
2	1D-ConvNet	Accuracy (%)	93.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Quaternion Neural Networks	Accuracy (10-fold)	98.53	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	MicroNet-KWS-L	Accuracy	95.3	—	Unverified