SG-VAD: Stochastic Gates Based Speech Activity Detection

2022-10-28

Jonathan Svirsky, Ofir Lindenbaum

Code

github.com/jsvir/vad
Official, in paper, PyTorch, ★ 38

Abstract

We propose a novel voice activity detection (VAD) model for low-resource environments. Our key idea is to model VAD as a denoising task and to construct a network designed to identify nuisance features for a speech classification task. We train the model to identify irrelevant features and predict the type of speech event simultaneously. Our model contains only 7.8K parameters, outperforms previously proposed methods on the AVA-Speech evaluation set, and provides comparable results on the HAVIC dataset. We present its architecture, experimental results, and an ablation study of the model's components. We publish the code and the models at https://www.github.com/jsvir/vad.
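
The abstract does not spell out the gating mechanism, so the following is only a minimal sketch of how a stochastic gate over input features could work, assuming the commonly used Gaussian relaxation of a Bernoulli gate (a learnable mean plus noise, clipped to [0, 1]). The function names, the `sigma` value, and the plain-Python style are illustrative assumptions and are not taken from the authors' repository.

```python
import math
import random


def sample_gate(mu, sigma=0.5, rng=None):
    """Draw one relaxed gate value in [0, 1] (hypothetical helper).

    Assumption: the gate is clip(mu + eps, 0, 1) with eps ~ N(0, sigma^2).
    """
    rng = rng or random
    eps = rng.gauss(0.0, sigma)
    return min(1.0, max(0.0, mu + eps))


def open_probability(mu, sigma=0.5):
    """P(mu + eps > 0): the standard normal CDF evaluated at mu / sigma."""
    return 0.5 * (1.0 + math.erf(mu / (sigma * math.sqrt(2.0))))


def gate_features(features, mus, sigma=0.5, training=True, rng=None):
    """Multiply each feature by its gate.

    During training the gates are sampled; at inference the noise is dropped
    and the clipped mean is used, so a gate at 0 removes a nuisance feature.
    """
    gated = []
    for x, mu in zip(features, mus):
        if training:
            z = sample_gate(mu, sigma, rng)
        else:
            z = min(1.0, max(0.0, mu))
        gated.append(x * z)
    return gated


def sparsity_penalty(mus, sigma=0.5):
    """Expected number of open gates; adding it to the loss favors closed gates."""
    return sum(open_probability(mu, sigma) for mu in mus)
```

In a setup like this, the classification loss and the sparsity penalty would be minimized together, which is one way to read "simultaneously identify irrelevant features while predicting the type of speech event".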

Tasks

Action Detection, Activity Detection, Denoising

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
AVA-Speech	SG-VAD (ours)	ROC-AUC	94.3	—	Unverified
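
For readers who want to check a ROC-AUC figure like the one in the table, the metric can be computed from per-frame labels and scores. The sketch below uses the pairwise (rank-based) definition in plain Python; the function name and the toy labels and scores are made up for illustration and are not data from the paper.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (speech, non-speech) pairs ranked correctly.

    labels: 1 for speech frames, 0 for non-speech frames.
    scores: higher means "more likely speech". Ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (not from the paper): 3 of the 4 pairs are ranked correctly.
toy_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The table reports the value as a percentage, so a result from this function would be multiplied by 100 before comparison. The quadratic loop is fine for a sanity check but too slow for a full evaluation set, where a sort-based implementation is the usual choice.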
