Sound Event Localization and Detection

Given multichannel audio input, a sound event localization and detection (SELD) system outputs a temporal activation track for each target sound class, along with one or more corresponding spatial trajectories whenever a track indicates activity. The result is a spatio-temporal characterization of the acoustic scene that can support a wide range of machine cognition tasks, such as inferring the type of environment, self-localization, navigation without visual input or with occluded targets, tracking of specific types of sound sources, smart-home applications, scene visualization, and audio surveillance.
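The output described above can be pictured as a small data structure. The sketch below is only an illustration under stated assumptions: it assumes frame-wise activation scores, at most one active source per class and frame, directions given as (azimuth, elevation) in degrees, and a fixed threshold of 0.5. The names `SoundEvent` and `extract_events` are hypothetical and do not come from any of the listed papers.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Assumed convention: a direction is (azimuth, elevation) in degrees.
Direction = Tuple[float, float]


@dataclass
class SoundEvent:
    class_index: int
    onset: int                   # first active frame
    offset: int                  # one past the last active frame
    trajectory: List[Direction]  # one direction per active frame


def extract_events(
    activity: Sequence[Sequence[float]],
    directions: Sequence[Sequence[Direction]],
    threshold: float = 0.5,
) -> List[SoundEvent]:
    """Group frame-wise output into events.

    activity[t][c] is the activation score of class c at frame t;
    directions[t][c] is the estimated direction for that class and frame.
    """
    events: List[SoundEvent] = []
    num_frames = len(activity)
    num_classes = len(activity[0]) if num_frames else 0
    for c in range(num_classes):
        onset = None
        for t in range(num_frames):
            is_active = activity[t][c] >= threshold
            if is_active and onset is None:
                onset = t  # a new active run starts here
            elif not is_active and onset is not None:
                path = [directions[k][c] for k in range(onset, t)]
                events.append(SoundEvent(c, onset, t, path))
                onset = None
        if onset is not None:
            # The run is still active at the end of the recording.
            path = [directions[k][c] for k in range(onset, num_frames)]
            events.append(SoundEvent(c, onset, num_frames, path))
    return events


if __name__ == "__main__":
    # Two classes, four frames; class 0 is active in frames 1 and 2.
    scores = [[0.1, 0.0], [0.9, 0.0], [0.8, 0.0], [0.2, 0.0]]
    doas = [[(0.0, 0.0), (0.0, 0.0)],
            [(30.0, 5.0), (0.0, 0.0)],
            [(35.0, 5.0), (0.0, 0.0)],
            [(0.0, 0.0), (0.0, 0.0)]]
    for event in extract_events(scores, doas):
        print(event)
```

Systems that handle overlapping sources of the same class need more than one track per class, which this single-track sketch does not cover.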

Papers

Showing 1–50 of 65 papers

Title	Date	Tasks	Status	Hype
Spatial and Semantic Embedding Integration for Stereo Sound Event Localization and Detection in Regular Videos	Jul 7, 2025	Sound Event Localization and Detection	Unverified	0
Stereo sound event localization and detection based on PSELDnet pretraining and BiMamba sequence modeling	Jun 16, 2025	Decoder, Mamba	Unverified	0
CST-former: Multidimensional Attention-based Transformer for Sound Event Localization and Detection in Real Scenes	Apr 17, 2025	Event Detection, Sound Event Localization and Detection	Unverified	0
Reverberation-based Features for Sound Event Localization and Detection with Distance Estimation	Apr 11, 2025	Direction of Arrival Estimation, Sound Event Localization and Detection	Code Available	0
An Experimental Study on Joint Modeling for Sound Event Localization and Detection with Source Distance Estimation	Jan 18, 2025	Event Detection, Sound Event Detection	Unverified	0
MVANet: Multi-Stage Video Attention Network for Sound Event Localization and Detection with Source Distance Estimation	Nov 21, 2024	Data Augmentation, Sound Event Localization and Detection	Code Available	0
Class-Incremental Learning for Sound Event Localization and Detection	Nov 19, 2024	class-incremental learning, Class Incremental Learning	Unverified	0
PSELDNets: Pre-trained Neural Networks on Large-scale Synthetic Datasets for Sound Event Localization and Detection	Nov 10, 2024	Direction of Arrival Estimation, Sound Event Localization and Detection	Code Available	1
DOA-Aware Audio-Visual Self-Supervised Learning for Sound Event Localization and Detection	Oct 30, 2024	Contrastive Learning, Self-Supervised Learning	Unverified	0
Leveraging Reverberation and Visual Depth Cues for Sound Event Localization and Detection with Distance Estimation	Oct 29, 2024	Sound Event Localization and Detection	Unverified	0
Real-Time Sound Event Localization and Detection: Deployment Challenges on Edge Devices	Sep 18, 2024	Raspberry Pi 3, Sound Event Localization and Detection	Code Available	0
Learning Spatially-Aware Language and Audio Embeddings	Sep 17, 2024	Attribute, Contrastive Learning	Unverified	0
Learning Multi-Target TDOA Features for Sound Event Localization and Detection	Aug 30, 2024	Sound Event Localization and Detection	Code Available	1
SELD-Mamba: Selective State-Space Model for Sound Event Localization and Detection with Source Distance Estimation	Aug 9, 2024	Computational Efficiency, Event Detection	Unverified	0
Squeeze-and-Excite ResNet-Conformers for Sound Event Localization, Detection, and Distance Estimation for DCASE 2024 Challenge	Jul 12, 2024	Sound Event Localization and Detection	Unverified	0
Text-Queried Target Sound Event Localization	Jun 23, 2024	Room Impulse Response (RIR), Sound Event Localization and Detection	Unverified	0
Exploring Audio-Visual Information Fusion for Sound Event Localization and Detection In Low-Resource Realistic Scenarios	Jun 21, 2024	Data Augmentation, Sound Event Localization and Detection	Unverified	0
MFF-EINV2: Multi-scale Feature Fusion across Spectral-Spatial-Temporal Domains for Sound Event Localization and Detection	Jun 13, 2024	Sound Event Localization and Detection	Code Available	1
6DoF SELD: Sound Event Localization and Detection Using Microphones and Motion Tracking Sensors on self-motioning human	Mar 4, 2024	Sound Event Localization and Detection	Unverified	0
Overview of the L3DAS23 Challenge on Audio-Visual Extended Reality	Feb 14, 2024	Audio Signal Processing, Sound Event Localization and Detection	Unverified	0
BAT: Learning to Reason about Spatial Sounds with Large Language Models	Feb 2, 2024	Event Detection, Language Modelling	Unverified	0
Enhanced Sound Event Localization and Detection in Real 360-degree audio-visual soundscapes	Jan 29, 2024	Data Augmentation, Sound Event Localization and Detection	Code Available	1
Spatial Scaper: A Library to Simulate and Augment Soundscapes for Sound Event Localization and Detection in Realistic Rooms	Jan 19, 2024	Data Augmentation, Diversity	Code Available	2
Selective-Memory Meta-Learning with Environment Representations for Sound Event Localization and Detection	Dec 27, 2023	Meta-Learning, Sound Event Localization and Detection	Code Available	1
CST-former: Transformer with Channel-Spectro-Temporal Attention for Sound Event Localization and Detection	Dec 20, 2023	Sound Event Localization and Detection	Unverified	0
Fusion of Audio and Visual Embeddings for Sound Event Localization and Detection	Dec 14, 2023	Data Augmentation, Event Detection	Code Available	1
w2v-SELD: A Sound Event Localization and Detection Framework for Self-Supervised Spatial Audio Pre-Training	Dec 12, 2023	Event Detection, Sound Event Detection	Code Available	1
Feature Aggregation in Joint Sound Classification and Localization Neural Networks	Oct 29, 2023	regression, Sound Classification	Unverified	0
SwG-former: A Sliding-Window Graph Convolutional Network for Simultaneous Spatial-Temporal Information Extraction in Sound Event Localization and Detection	Oct 21, 2023	Event Detection, Sound Event Detection	Unverified	0
Leveraging Geometrical Acoustic Simulations of Spatial Room Impulse Responses for Improved Sound Event Detection and Localization	Sep 6, 2023	Event Detection, Sound Event Detection	Code Available	0
META-SELD: Meta-Learning for Fast Adaptation to the new environment in Sound Event Localization and Detection	Aug 17, 2023	Meta-Learning, Sound Event Localization and Detection	Unverified	0
Dynamic Kernel Convolution Network with Scene-dedicate Training for Sound Event Localization and Detection	Jul 17, 2023	Data Augmentation, Sound Event Localization and Detection	Unverified	0
STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events	Jun 15, 2023	Sound Event Localization and Detection	Code Available	1
Divided spectro-temporal attention for sound event localization and detection in real scenes for DCASE2023 challenge	Jun 5, 2023	Event Detection, Sound Event Detection	Unverified	0
Perception Test: A Diagnostic Benchmark for Multimodal Video Models	May 23, 2023	Diagnostic, Grounded Video Question Answering	Code Available	2
AD-YOLO: You Look Only Once in Training Multiple Sound Event Localization and Detection	Mar 28, 2023	Direction of Arrival Estimation, Sound Event Detection	Code Available	1
CoLoC: Conditioned Localizer and Classifier for Sound Event Localization and Detection	Oct 25, 2022	Sound Event Localization and Detection	Unverified	0
Sound Event Localization and Detection for Real Spatial Sound Scenes: Event-Independent Network and Data Augmentation Chains	Sep 5, 2022	Data Augmentation, Direction of Arrival Estimation	Code Available	1
Data Augmentation and Squeeze-and-Excitation Network on Multiple Dimension for Sound Event Localization and Detection in Real Scenes	Jun 24, 2022	Data Augmentation, Sound Event Localization and Detection	Unverified	0
A Synapse-Threshold Synergistic Learning Approach for Spiking Neural Networks	Jun 10, 2022	Event data classification, Gesture Recognition	Code Available	0
STARSS22: A dataset of spatial recordings of real scenes with spatiotemporal annotations of sound events	Jun 4, 2022	Sound Event Localization and Detection	Code Available	1
Dual Quaternion Ambisonics Array for Six-Degree-of-Freedom Acoustic Representation	Apr 4, 2022	Sound Event Localization and Detection	Code Available	0
Filler Word Detection and Classification: A Dataset and Benchmark	Mar 28, 2022	Classification, Keyword Spotting	Code Available	0
Locate This, Not That: Class-Conditioned Sound Event DOA Estimation	Mar 8, 2022	All, Sound Event Localization and Detection	Unverified	0
L3DAS22 Challenge: Learning 3D Audio Sources in a Real Office Environment	Feb 21, 2022	Sound Event Localization and Detection, Speech Enhancement	Code Available	1
Echo-aware Adaptation of Sound Event Localization and Detection in Unknown Environments	Feb 18, 2022	Domain Adaptation, Sound Event Localization and Detection	Code Available	0
Wearable SELD dataset: Dataset for sound event localization and detection using wearable devices around head	Feb 17, 2022	Sound Event Localization and Detection	Code Available	0
SALSA-Lite: A Fast and Effective Feature for Polyphonic Sound Event Localization and Detection with Microphone Arrays	Nov 16, 2021	Sound Event Localization and Detection	Code Available	1
Multi-ACCDOA: Localizing and Detecting Overlapping Sounds from the Same Class with Auxiliary Duplicating Permutation Invariant Training	Oct 14, 2021	Sound Event Localization and Detection	Code Available	1
Spatial mixup: Directional loudness modification as data augmentation for sound event localization and detection	Oct 12, 2021	Data Augmentation, Sound Event Localization and Detection	Code Available	0

Datasets: PodcastFillers, STARSS22, L3DAS21, RWCP Sound Scene Database, TAU-NIGENS Spatial Sound Events 2021

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	AVC-FillerNet	event-based F1 score	92.8	—	Unverified
2	VC-FillerNet	event-based F1 score	71	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Baseline (MIC)	Class-dependent localization error	32.2	—	Unverified
2	Baseline (FOA)	Class-dependent localization error	29.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	DualQSELD-TCN (parallel)	SELD score	0.32	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	STL-SNN	accuracy	98.4	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SALSA-FOA	ER≤20°	0.38	—	Unverified