Keyword Spotting

In speech processing, keyword spotting deals with the identification of keywords in utterances.

(Image credit: Simon Grest)

Papers

Showing 126–150 of 407 papers

Title	Date	Tasks	Status	Hype
How Tiny Can Analog Filterbank Features Be Made for Ultra-low-power On-device Keyword Spotting?	Apr 17, 2023	Keyword Spotting	Unverified	0
Unsupervised Speech Representation Pooling Using Vector Quantization	Apr 8, 2023	Emotion Recognition, intent-classification	Code Available	0
To Wake-up or Not to Wake-up: Reducing Keyword False Alarm by Successive Refinement	Apr 6, 2023	Keyword Spotting	Unverified	0
AraSpot: Arabic Spoken Command Spotting	Mar 29, 2023	Data Augmentation, Keyword Spotting	Code Available	0
Exploring Representation Learning for Small-Footprint Keyword Spotting	Mar 20, 2023	Contrastive Learning, Keyword Spotting	Unverified	0
Self-supervised speech representation learning for keyword-spotting with light-weight transformers	Mar 7, 2023	Keyword Spotting, Representation Learning	Unverified	0
ST-KeyS: Self-Supervised Transformer for Keyword Spotting in Historical Handwritten Documents	Mar 6, 2023	Keyword Spotting, Self-Supervised Learning	Unverified	0
Fixed-point quantization aware training for on-device keyword-spotting	Mar 4, 2023	Keyword Spotting, Quantization	Unverified	0
Scalable Weight Reparametrization for Efficient Transfer Learning	Feb 26, 2023	Keyword Spotting, Transfer Learning	Unverified	0
Locale Encoding For Scalable Multilingual Keyword Spotting Models	Feb 25, 2023	Keyword Spotting	Unverified	0
Speech Privacy Leakage from Shared Gradients in Distributed Learning	Feb 21, 2023	Federated Learning, Keyword Spotting	Unverified	0
LipLearner: Customizable Silent Speech Interactions on Mobile Devices	Feb 12, 2023	Contrastive Learning, Incremental Learning	Code Available	1
A Comparison of Temporal Encoders for Neuromorphic Keyword Spotting with Few Neurons	Jan 24, 2023	Binary Classification, Keyword Spotting	Unverified	0
Analyzing the Representational Geometry of Acoustic Word Embeddings	Jan 8, 2023	Keyword Spotting, Word Embeddings	Unverified	0
VSVC: Backdoor attack against Keyword Spotting based on Voiceprint Selection and Voice Conversion	Dec 20, 2022	Backdoor Attack, Keyword Spotting	Unverified	0
Learnable Front Ends Based on Temporal Modulation for Music Tagging	Nov 28, 2022	Keyword Spotting, Music Tagging	Unverified	0
ASiT: Local-Global Audio Spectrogram vIsion Transformer for Event Classification	Nov 23, 2022	Keyword Spotting, Self-Supervised Learning	Code Available	1
Filterbank Learning for Noise-Robust Small-Footprint Keyword Spotting	Nov 19, 2022	Keyword Spotting, Small-Footprint Keyword Spotting	Unverified	0
PBSM: Backdoor attack against Keyword spotting based on pitch boosting and sound masking	Nov 16, 2022	Backdoor Attack, Keyword Spotting	Unverified	0
BiFSMNv2: Pushing Binary Neural Networks for Keyword Spotting to Real-Network Performance	Nov 13, 2022	Binarization, Keyword Spotting	Code Available	1
Exploring Sequence-to-Sequence Transformer-Transducer Models for Keyword Spotting	Nov 11, 2022	Keyword Spotting	Unverified	0
LiCo-Net: Linearized Convolution Network for Hardware-efficient Keyword Spotting	Nov 9, 2022	Keyword Spotting	Unverified	0
Integrated Parameter-Efficient Tuning for General-Purpose Audio Models	Nov 4, 2022	Genre classification, Keyword Spotting	Code Available	0
Harnessing the Power of Explanations for Incremental Training: A LIME-Based Approach	Nov 2, 2022	Feature Importance, Incremental Learning	Unverified	0
MAST: Multiscale Audio Spectrogram Transformers	Nov 2, 2022	Audio Classification, Keyword Spotting	Code Available	1

Datasets: QUESST, Google Speech Commands, hey Siri, FKD, Google Speech Commands V2 35, TensorFlow, VoxForge, Google Speech Commands (v2), Google Speech Commands V2 12

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	NNI non-filtered (for the development set)	Cnxe	6.09	—	Unverified
2	NNI Choi (for the development set)	Cnxe	5.89	—	Unverified
3	NTU rnn (eval)	Cnxe	2.01	—	Unverified
4	NTU dtw (eval)	Cnxe	2.01	—	Unverified
5	NTU dtw (dev)	Cnxe	2.01	—	Unverified
6	NTU rnn (dev)	Cnxe	2.01	—	Unverified
7	ELiRF SDTW (eval)	Cnxe	1.19	—	Unverified
8	ELiRF SDTW-avg (eval)	Cnxe	1.07	—	Unverified
9	ELiRF SDTW (dev)	Cnxe	1.07	—	Unverified
10	CUNY [Subseq+MFCC] (eval)	Cnxe	1.07	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	WaveFormer	Google Speech Commands V2 12	98.8	—	Unverified
2	QNN	Google Speech Commands V2 35	98.6	—	Unverified
3	TripletLoss-res15	Google Speech Commands V1 12	98.56	—	Unverified
4	M2D	Google Speech Commands V2 35	98.5	—	Unverified
5	EAT-S	Google Speech Commands V2 35	98.15	—	Unverified
6	Audio Spectrogram Transformer	Google Speech Commands V2 35	98.11	—	Unverified
7	EdgeCRNN 2.0×	Google Speech Commands V2 12	98.05	—	Unverified
8	BC-ResNet-8	Google Speech Commands V1 12	98	—	Unverified
9	HTS-AT	Google Speech Commands V2 35	98	—	Unverified
10	Wav2KWS	Google Speech Commands V1 12	97.9	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Stacked 1D CNN	Error Rate	1.99	—	Unverified
2	End-to-end DNN-HMM	Error Rate	1.7	—	Unverified
3	HEiMDaL	Error Rate	0.45	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Res26	Accuracy	95.88	—	Unverified
2	EfficientNet-A0 + SA + TL	Accuracy	95.83	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	QuaternionNeuralNetwork	Accuracy (10-fold)	98.53	—	Unverified
2	SSAMBA	Accuracy (10-fold)	97.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TensorFlow's model version 2	TFMA	89.7	—	Unverified
2	TensorFlow's model version 1	TFMA	85.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	2D-ConvNet	Accuracy (%)	95.4	—	Unverified
2	1D-ConvNet	Accuracy (%)	93.7	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Quaternion Neural Networks	Accuracy (10-fold)	98.53	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	MicroNet-KWS-L	Accuracy	95.3	—	Unverified