SOTAVerified

Keyword Spotting

In speech processing, keyword spotting deals with the identification of keywords in utterances.

Papers

Showing 1-50 of 407 papers

Title | Status | Hype
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit | Code | 6
MFA-KWS: Effective Keyword Spotting with Multi-head Frame-asynchronous Decoding | Code | 2
AST: Audio Spectrogram Transformer | Code | 2
SSAST: Self-Supervised Audio Spectrogram Transformer | Code | 2
SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model | Code | 2
TDT-KWS: Fast And Accurate Keyword Spotting Using Token-and-duration Transducer | Code | 2
Training Keyword Spotters with Limited and Synthesized Speech Data | Code | 2
WeKws: A production first small-footprint end-to-end Keyword Spotting Toolkit | Code | 2
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection | Code | 2
GLAP: General contrastive audio-text pretraining across domains and languages | Code | 2
Streaming Keyword Spotting Boosted by Cross-layer Discrimination Consistency | Code | 2
Small-Footprint Keyword Spotting with Multi-Scale Temporal Convolution | Code | 1
Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition | Code | 1
Sparse Binarization for Fast Keyword Spotting | Code | 1
Reduced Precision Floating-Point Optimization for Deep Neural Network On-Device Learning on MicroControllers | Code | 1
Improving Label-Deficient Keyword Spotting Through Self-Supervised Pretraining | Code | 1
Rainbow Keywords: Efficient Incremental Learning for Online Spoken Keyword Spotting | Code | 1
Seeing wake words: Audio-visual Keyword Spotting | Code | 1
AfriHate: A Multilingual Collection of Hate Speech and Abusive Language Datasets for African Languages | Code | 1
EfficientNet-Absolute Zero for Continuous Speech Keyword Spotting | Code | 1
MicroNets: Neural Network Architectures for Deploying TinyML Applications on Commodity Microcontrollers | Code | 1
The taste of IPA: Towards open-vocabulary keyword spotting and forced alignment in any language | Code | 1
Few-Shot Keyword Spotting With Prototypical Networks | Code | 1
Progressive Continual Learning for Spoken Keyword Spotting | Code | 1
Few-Shot Open-Set Learning for On-Device Customization of KeyWord Spotting Systems | Code | 1
End-to-End Audio Strikes Back: Boosting Augmentations Towards An Efficient Audio Classification Network | Code | 1
Self-Learning for Personalized Keyword Spotting on Ultra-Low-Power Audio Sensors | Code | 1
SiDi KWS: A Large-Scale Multilingual Dataset for Keyword Spotting | Code | 1
MLPerf Tiny Benchmark | Code | 1
Howl: A Deployed, Open-Source Wake Word Detection System | Code | 1
Phoneme Boundary Detection using Learnable Segmental Features | Code | 1
Learning Efficient Representations for Keyword Spotting with Triplet Loss | Code | 1
BiFSMN: Binary Neural Network for Keyword Spotting | Code | 1
BiFSMNv2: Pushing Binary Neural Networks for Keyword Spotting to Real-Network Performance | Code | 1
Learning Audio-Text Agreement for Open-vocabulary Keyword Spotting | Code | 1
LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT | Code | 1
BSL-1K: Scaling up co-articulated sign language recognition using mouthing cues | Code | 1
Broadcasted Residual Learning for Efficient Keyword Spotting | Code | 1
Deep Residual Learning for Small-Footprint Keyword Spotting | Code | 1
Decentralizing Feature Extraction with Quantum Convolutional Neural Network for Automatic Speech Recognition | Code | 1
MAST: Multiscale Audio Spectrogram Transformers | Code | 1
An End-to-End Architecture for Keyword Spotting and Voice Activity Detection | Code | 1
Auto-KWS 2021 Challenge: Task, Datasets, and Baselines | Code | 1
ASiT: Local-Global Audio Spectrogram vIsion Transformer for Event Classification | Code | 1
ED-sKWS: Early-Decision Spiking Neural Networks for Rapid,and Energy-Efficient Keyword Spotting | Code | 1
MM-KWS: Multi-modal Prompts for Multilingual User-defined Keyword Spotting | Code | 1
Keyword Spotting System and Evaluation of Pruning and Quantization Methods on Low-power Edge Microcontrollers | Code | 1
Chameleon: A MatMul-Free Temporal Convolutional Network Accelerator for End-to-End Few-Shot and Continual Learning from Sequential Data | Code | 1
Few-Shot Keyword Spotting in Any Language | Code | 1
Attention-Free Keyword Spotting | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | NNI non-filtered (for the development set) | Cnxe | 6.09 | - | Unverified
2 | NNI Choi (for the development set) | Cnxe | 5.89 | - | Unverified
3 | NTU rnn (eval) | Cnxe | 2.01 | - | Unverified
4 | NTU dtw (eval) | Cnxe | 2.01 | - | Unverified
5 | NTU dtw (dev) | Cnxe | 2.01 | - | Unverified
6 | NTU rnn (dev) | Cnxe | 2.01 | - | Unverified
7 | ELiRF SDTW (eval) | Cnxe | 1.19 | - | Unverified
8 | ELiRF SDTW-avg (eval) | Cnxe | 1.07 | - | Unverified
9 | ELiRF SDTW (dev) | Cnxe | 1.07 | - | Unverified
10 | CUNY [Subseq+MFCC] (eval) | Cnxe | 1.07 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | WaveFormer | Google Speech Commands V2 12 | 98.8 | - | Unverified
2 | QNN | Google Speech Commands V2 35 | 98.6 | - | Unverified
3 | TripletLoss-res15 | Google Speech Commands V1 12 | 98.56 | - | Unverified
4 | M2D | Google Speech Commands V2 35 | 98.5 | - | Unverified
5 | EAT-S | Google Speech Commands V2 35 | 98.15 | - | Unverified
6 | Audio Spectrogram Transformer | Google Speech Commands V2 35 | 98.11 | - | Unverified
7 | EdgeCRNN 2.0× | Google Speech Commands V2 12 | 98.05 | - | Unverified
8 | BC-ResNet-8 | Google Speech Commands V1 12 | 98 | - | Unverified
9 | HTS-AT | Google Speech Commands V2 35 | 98 | - | Unverified
10 | Wav2KWS | Google Speech Commands V1 12 | 97.9 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Stacked 1D CNN | Error Rate | 1.99 | - | Unverified
2 | End-to-end DNN-HMM | Error Rate | 1.7 | - | Unverified
3 | HEiMDaL | Error Rate | 0.45 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Res26 | Accuracy | 95.88 | - | Unverified
2 | EfficientNet-A0 + SA + TL | Accuracy | 95.83 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | QuaternionNeuralNetwork | Accuracy (10-fold) | 98.53 | - | Unverified
2 | SSAMBA | Accuracy (10-fold) | 97.4 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TensorFlow's model version 2 | TFMA | 89.7 | - | Unverified
2 | TensorFlow's model version 1 | TFMA | 85.4 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | 2D-ConvNet | Accuracy (%) | 95.4 | - | Unverified
2 | 1D-ConvNet | Accuracy (%) | 93.7 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Quaternion Neural Networks | Accuracy (10-fold) | 98.53 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MicroNet-KWS-L | Accuracy | 95.3 | - | Unverified