LHGNN: Local-Higher Order Graph Neural Networks For Audio Classification and Tagging
Shubhr Singh, Emmanouil Benetos, Huy Phan, Dan Stowell
Abstract
Transformers have set new benchmarks in audio processing tasks, leveraging self-attention mechanisms to capture complex patterns and dependencies within audio data. However, their focus on pairwise interactions limits their ability to process the higher-order relations essential for identifying distinct audio objects. To address this limitation, this work introduces the Local-Higher Order Graph Neural Network (LHGNN), a graph-based model that enhances feature understanding by integrating local neighbourhood information with higher-order information from Fuzzy C-Means clusters, thereby capturing a broader spectrum of audio relationships. Evaluation of the model on three publicly available audio datasets shows that it outperforms Transformer-based models across all benchmarks while operating with substantially fewer parameters. Moreover, LHGNN demonstrates a distinct advantage in scenarios lacking ImageNet pretraining, establishing its effectiveness and efficiency in environments where extensive pretraining data is unavailable.
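The core idea of combining local neighbourhood aggregation with higher-order cluster context can be illustrated with a minimal sketch. The function names, the mean-based fusion, and all hyperparameters below are hypothetical placeholders, not the paper's actual architecture: each node (audio patch embedding) is combined with the mean of its k nearest neighbours (local information) and with a Fuzzy C-Means membership-weighted mix of cluster centroids (higher-order information).

```python
import numpy as np

def knn_local_features(x, k=4):
    """Mean of each node's k nearest neighbours (local context). x: (n, d)."""
    d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    idx = np.argsort(d2, axis=1)[:, 1:k + 1]              # k nearest, excluding self
    return x[idx].mean(axis=1)

def fuzzy_cmeans_features(x, c=3, m=2.0, iters=50, seed=0):
    """Standard Fuzzy C-Means; returns each node's membership-weighted
    combination of cluster centroids (higher-order context)."""
    rng = np.random.default_rng(seed)
    u = rng.random((x.shape[0], c))
    u /= u.sum(axis=1, keepdims=True)                     # soft memberships (n, c)
    for _ in range(iters):
        um = u ** m
        v = (um.T @ x) / um.sum(axis=0)[:, None]          # update centroids (c, d)
        d = np.linalg.norm(x[:, None] - v[None], axis=2) + 1e-9
        u = 1.0 / (d ** (2.0 / (m - 1.0)))
        u /= u.sum(axis=1, keepdims=True)                 # renormalise memberships
    return u @ v

def lhgnn_layer_sketch(x, k=4, c=3):
    # Hypothetical fusion: node feature + local k-NN context + cluster context.
    # A real layer would use learned projections rather than a plain sum.
    return x + knn_local_features(x, k) + fuzzy_cmeans_features(x, c)
```

In the actual model, these aggregations would be interleaved with learned weights and nonlinearities inside each GNN layer; the sketch only shows where the two sources of relational information enter.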
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| Audio Set | LHGNN | Mean AP | 46.6 | — | Unverified |
| ESC-50 | LHGNN | Top-1 Accuracy | 96.2 | — | Unverified |
| FSD50K | LHGNN | Mean AP | 59.0 | — | Unverified |