Sound Event Localization and Detection

Given multichannel audio input, a sound event localization and detection (SELD) system outputs a temporal activation track for each target sound class, along with one or more corresponding spatial trajectories whenever the track indicates activity. The result is a spatio-temporal characterization of the acoustic scene that can support a wide range of machine cognition tasks, such as inferring the type of environment, self-localization, navigation without visual input or with occluded targets, tracking specific types of sound sources, smart-home applications, scene visualization, and audio surveillance.
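As a rough illustration of the output described above, the sketch below stores frame-wise estimates and derives a per-class activation track and spatial trajectory from them. The names FrameEstimate, activation_track, and trajectory, and the use of azimuth/elevation angles in degrees, are assumptions made for this example and are not taken from any particular SELD system or dataset format.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class FrameEstimate:
    """One active (class, direction) estimate in a single time frame (hypothetical type)."""
    class_name: str
    azimuth_deg: float
    elevation_deg: float


# Assumed container: frame index -> estimates active in that frame.
SeldOutput = Dict[int, List[FrameEstimate]]


def activation_track(output: SeldOutput, class_name: str, num_frames: int) -> List[bool]:
    """Return, for each frame, whether the given class is active."""
    return [
        any(e.class_name == class_name for e in output.get(t, []))
        for t in range(num_frames)
    ]


def trajectory(output: SeldOutput, class_name: str) -> List[Tuple[int, float, float]]:
    """Return (frame, azimuth, elevation) for every frame in which the class is active."""
    return [
        (t, e.azimuth_deg, e.elevation_deg)
        for t in sorted(output)
        for e in output[t]
        if e.class_name == class_name
    ]


# Toy scene: speech in frames 0-1, a door sound in frames 1 and 3.
example: SeldOutput = {
    0: [FrameEstimate("speech", 30.0, 0.0)],
    1: [FrameEstimate("speech", 32.0, 0.0), FrameEstimate("door", -90.0, 10.0)],
    3: [FrameEstimate("door", -90.0, 10.0)],
}

print(activation_track(example, "speech", 4))
print(trajectory(example, "door"))
```

Keeping one list of estimates per frame allows several simultaneous sources, including overlapping events of the same class, which a single direction per class and frame could not represent.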

Papers


Showing 1–25 of 65 papers

Title	Date	Tasks	Status	Hype
Perception Test: A Diagnostic Benchmark for Multimodal Video Models	May 23, 2023	Diagnostic, Grounded Video Question Answering	Code Available	2
Spatial Scaper: A Library to Simulate and Augment Soundscapes for Sound Event Localization and Detection in Realistic Rooms	Jan 19, 2024	Data Augmentation, Diversity	Code Available	2
What Makes Sound Event Localization and Detection Difficult? Insights from Error Analysis	Jul 22, 2021	Direction of Arrival Estimation, Event Detection	Code Available	1
SALSA-Lite: A Fast and Effective Feature for Polyphonic Sound Event Localization and Detection with Microphone Arrays	Nov 16, 2021	Sound Event Localization and Detection	Code Available	1
Selective-Memory Meta-Learning with Environment Representations for Sound Event Localization and Detection	Dec 27, 2023	Meta-Learning, Sound Event Localization and Detection	Code Available	1
w2v-SELD: A Sound Event Localization and Detection Framework for Self-Supervised Spatial Audio Pre-Training	Dec 12, 2023	Event Detection, Sound Event Detection	Code Available	1
MFF-EINV2: Multi-scale Feature Fusion across Spectral-Spatial-Temporal Domains for Sound Event Localization and Detection	Jun 13, 2024	Sound Event Localization and Detection	Code Available	1
A Dataset of Reverberant Spatial Sound Scenes with Moving Sources for Sound Event Localization and Detection	Jun 2, 2020	Sound Event Localization and Detection	Code Available	1
Overview and Evaluation of Sound Event Localization and Detection in DCASE 2019	Sep 6, 2020	Data Augmentation, Sound Event Localization and Detection	Code Available	1
PSELDNets: Pre-trained Neural Networks on Large-scale Synthetic Datasets for Sound Event Localization and Detection	Nov 10, 2024	Direction of Arrival Estimation, Sound Event Localization and Detection	Code Available	1
A Dataset of Dynamic Reverberant Sound Scenes with Directional Interferers for Sound Event Localization and Detection	Jun 13, 2021	Sound Event Localization and Detection	Code Available	1
Sound Event Localization and Detection for Real Spatial Sound Scenes: Event-Independent Network and Data Augmentation Chains	Sep 5, 2022	Data Augmentation, Direction of Arrival Estimation	Code Available	1
STARSS22: A dataset of spatial recordings of real scenes with spatiotemporal annotations of sound events	Jun 4, 2022	Sound Event Localization and Detection	Code Available	1
STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events	Jun 15, 2023	Sound Event Localization and Detection	Code Available	1
DCASE 2021 Task 3: Spectrotemporally-aligned Features for Polyphonic Sound Event Localization and Detection	Jun 29, 2021	Audio Classification, Direction of Arrival Estimation	Code Available	1
Fusion of Audio and Visual Embeddings for Sound Event Localization and Detection	Dec 14, 2023	Data Augmentation, Event Detection	Code Available	1
ACCDOA: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization and Detection	Oct 29, 2020	Event Detection, Sound Event Detection	Code Available	1
Learning Multi-Target TDOA Features for Sound Event Localization and Detection	Aug 30, 2024	Sound Event Localization and Detection	Code Available	1
AD-YOLO: You Look Only Once in Training Multiple Sound Event Localization and Detection	Mar 28, 2023	Direction of Arrival Estimation, Sound Event Detection	Code Available	1
L3DAS22 Challenge: Learning 3D Audio Sources in a Real Office Environment	Feb 21, 2022	Sound Event Localization and Detection, Speech Enhancement	Code Available	1
Enhanced Sound Event Localization and Detection in Real 360-degree audio-visual soundscapes	Jan 29, 2024	Data Augmentation, Sound Event Localization and Detection	Code Available	1
Multi-ACCDOA: Localizing and Detecting Overlapping Sounds from the Same Class with Auxiliary Duplicating Permutation Invariant Training	Oct 14, 2021	Sound Event Localization and Detection	Code Available	1
SALSA: Spatial Cue-Augmented Log-Spectrogram Features for Polyphonic Sound Event Localization and Detection	Oct 1, 2021	Direction of Arrival Estimation, Event Detection	Code Available	1
SELD-TCN: Sound Event Localization & Detection via Temporal Convolutional Networks	Mar 3, 2020	Acoustic Scene Classification, GPU	Code Available	1
Data Augmentation and Squeeze-and-Excitation Network on Multiple Dimension for Sound Event Localization and Detection in Real Scenes	Jun 24, 2022	Data Augmentation, Sound Event Localization and Detection	Unverified	0


Benchmark Results

PodcastFillers

#	Model	Metric	Claimed	Verified	Status
1	AVC-FillerNet	event-based F1 score	92.8	—	Unverified
2	VC-FillerNet	event-based F1 score	71	—	Unverified

STARSS22

#	Model	Metric	Claimed	Verified	Status
1	Baseline (MIC)	Class-dependent localization error	32.2	—	Unverified
2	Baseline (FOA)	Class-dependent localization error	29.3	—	Unverified

L3DAS21

#	Model	Metric	Claimed	Verified	Status
1	DualQSELD-TCN (parallel)	SELD score	0.32	—	Unverified

RWCP Sound Scene Database

#	Model	Metric	Claimed	Verified	Status
1	STL-SNN	accuracy	98.4	—	Unverified

TAU-NIGENS Spatial Sound Events 2021

#	Model	Metric	Claimed	Verified	Status
1	SALSA-FOA	ER≤20°	0.38	—	Unverified