EfficientLEAF: A Faster LEarnable Audio Frontend of Questionable Use

2022-07-12

Jan Schlüter, Gerald Gutenbrunner

Code

github.com/cpjku/efficientleaf
Official, in paper, PyTorch, ★ 49

Abstract

In audio classification, differentiable auditory filterbanks with few parameters cover the middle ground between hard-coded spectrograms and raw audio. LEAF (arXiv:2101.08596), a Gabor-based filterbank combined with Per-Channel Energy Normalization (PCEN), has shown promising results, but is computationally expensive. With inhomogeneous convolution kernel sizes and strides, and by replacing PCEN with better parallelizable operations, we can reach similar results more efficiently. In experiments on six audio classification tasks, our frontend matches the accuracy of LEAF at 3% of the cost, but both fail to consistently outperform a fixed mel filterbank. The quest for learnable audio frontends is not solved.
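To illustrate why PCEN is harder to parallelize than pointwise compression, here is a minimal sketch of the commonly published PCEN formulation for a single frequency channel, in plain Python. It is not taken from the EfficientLEAF or LEAF code; the function name `pcen` and the default values for `s`, `alpha`, `delta`, `r`, and `eps` are assumptions chosen for illustration. The smoothing step depends on the previous frame, so the loop over time has to run sequentially.

```python
def pcen(energies, s=0.04, alpha=0.96, delta=2.0, r=0.5, eps=1e-6):
    """Per-Channel Energy Normalization for one channel (illustrative sketch).

    energies: sequence of non-negative filterbank energies over time.
    s:        smoothing coefficient of the first-order IIR smoother (assumed value).
    alpha:    strength of the automatic gain control (assumed value).
    delta, r: offset and exponent of the root compression (assumed values).
    """
    if not energies:
        return []
    out = []
    # Initialise the smoother with the first frame (one common choice).
    smoothed = energies[0]
    for e in energies:
        # Recurrence over time: each step needs the previous smoothed value.
        smoothed = (1.0 - s) * smoothed + s * e
        # Automatic gain control: divide by the smoothed energy.
        normalized = e / (eps + smoothed) ** alpha
        # Root compression with an offset so that silence maps to zero.
        out.append((normalized + delta) ** r - delta ** r)
    return out


if __name__ == "__main__":
    quiet = pcen([1.0] * 5)
    loud = pcen([100.0] * 5)
    # With alpha close to 1, a 100x louder constant signal yields a similar output.
    print(quiet[-1], loud[-1])
```

The recurrence on `smoothed` is the part that cannot be computed independently per frame, which is consistent with the abstract's motivation for replacing PCEN with operations that parallelize better. What those replacement operations are is not specified here.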

Tasks

Audio Classification, Classification, Instrument Recognition, Pitch Classification, Spoken Language Identification

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
BirdCLEF 2021	LEAF	Accuracy	42.3	—	Unverified
BirdCLEF 2021	melspect	Accuracy	39.9	—	Unverified
BirdCLEF 2021	EfficientLEAF (8s)	Accuracy	72.2	—	Unverified
BirdCLEF 2021	EfficientLEAF	Accuracy	42.9	—	Unverified
CREMA-D	EfficientLEAF	Accuracy	60.2	—	Unverified
CREMA-D	melspect	Accuracy	58.8	—	Unverified
CREMA-D	LEAF	Accuracy	50.2	—	Unverified
Speech Commands	EfficientLEAF	Accuracy	95.2	—	Unverified
Speech Commands	LEAF	Accuracy	95.1	—	Unverified
Speech Commands	melspect	Accuracy	95.1	—	Unverified
