SOTAVerified

Sound Event Localization and Detection

Given multichannel audio input, a sound event localization and detection (SELD) system outputs a temporal activation track for each target sound class, along with one or more corresponding spatial trajectories whenever the track indicates activity. The result is a spatio-temporal characterization of the acoustic scene that can be used in a wide range of machine cognition tasks, such as inference on the type of environment, self-localization, navigation without visual input or with occluded targets, tracking of specific types of sound sources, smart-home applications, scene visualization systems, and audio surveillance.
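As a rough illustration of the output described above, the following Python sketch represents each class's activation track as a list of per-frame directions of arrival and groups consecutive active frames into events. The names `SoundEvent` and `tracks_to_events`, the (azimuth, elevation) convention in degrees, and the restriction to a single source per class per frame are assumptions made for this example only; they are not taken from any of the listed papers.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Assumed convention: (azimuth, elevation) in degrees,
# or None when the class is inactive in that frame.
Doa = Optional[Tuple[float, float]]


@dataclass
class SoundEvent:
    label: str
    onset_frame: int
    offset_frame: int  # exclusive
    trajectory: List[Tuple[float, float]]  # one direction per active frame


def tracks_to_events(tracks: Dict[str, List[Doa]]) -> List[SoundEvent]:
    """Group consecutive active frames of each class into events."""
    events: List[SoundEvent] = []
    for label, frames in tracks.items():
        start = None
        # A trailing None acts as a sentinel that closes an event
        # still active in the last frame.
        for i, doa in enumerate(list(frames) + [None]):
            if doa is not None and start is None:
                start = i
            elif doa is None and start is not None:
                events.append(SoundEvent(label, start, i, list(frames[start:i])))
                start = None
    return events


# Hypothetical five-frame scene with two classes.
tracks = {
    "speech": [None, (30.0, 0.0), (32.0, 0.0), None, None],
    "door": [None, None, None, (-90.0, 10.0), (-90.0, 10.0)],
}
events = tracks_to_events(tracks)
for event in events:
    print(event.label, event.onset_frame, event.offset_frame, event.trajectory)
```

A system that handles overlapping sources of the same class would need several trajectories per class and frame, which this simplified structure does not cover.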

Papers

Showing 26-50 of 65 papers

Title | Status | Hype
An Experimental Study on Joint Modeling for Sound Event Localization and Detection with Source Distance Estimation | | 0
A Sequence Matching Network for Polyphonic Sound Event Localization and Detection | | 0
BAT: Learning to Reason about Spatial Sounds with Large Language Models | | 0
Class-Incremental Learning for Sound Event Localization and Detection | | 0
CoLoC: Conditioned Localizer and Classifier for Sound Event Localization and Detection | | 0
CST-former: Multidimensional Attention-based Transformer for Sound Event Localization and Detection in Real Scenes | | 0
CST-former: Transformer with Channel-Spectro-Temporal Attention for Sound Event Localization and Detection | | 0
Data Augmentation and Squeeze-and-Excitation Network on Multiple Dimension for Sound Event Localization and Detection in Real Scenes | | 0
Divided spectro-temporal attention for sound event localization and detection in real scenes for DCASE2023 challenge | | 0
DOA-Aware Audio-Visual Self-Supervised Learning for Sound Event Localization and Detection | | 0
Dynamic Kernel Convolution Network with Scene-dedicate Training for Sound Event Localization and Detection | | 0
Ensemble of ACCDOA- and EINV2-based Systems with D3Nets and Impulse Response Simulation for Sound Event Localization and Detection | | 0
Exploring Audio-Visual Information Fusion for Sound Event Localization and Detection In Low-Resource Realistic Scenarios | | 0
Feature Aggregation in Joint Sound Classification and Localization Neural Networks | | 0
Learning Spatially-Aware Language and Audio Embeddings | | 0
Leveraging Reverberation and Visual Depth Cues for Sound Event Localization and Detection with Distance Estimation | | 0
6DoF SELD: Sound Event Localization and Detection Using Microphones and Motion Tracking Sensors on self-motioning human | | 0
META-SELD: Meta-Learning for Fast Adaptation to the new environment in Sound Event Localization and Detection | | 0
Mobile Microphone Array Speech Detection and Localization in Diverse Everyday Environments | | 0
Overview of the L3DAS23 Challenge on Audio-Visual Extended Reality | | 0
SELD-Mamba: Selective State-Space Model for Sound Event Localization and Detection with Source Distance Estimation | | 0
Sound Event Localization based on Sound Intensity Vector Refined By DNN-Based Denoising and Source Separation | | 0
Sound source detection, localization and classification using consecutive ensemble of CRNN models | | 0
Spatial and Semantic Embedding Integration for Stereo Sound Event Localization and Detection in Regular Videos | | 0
Squeeze-and-Excite ResNet-Conformers for Sound Event Localization, Detection, and Distance Estimation for DCASE 2024 Challenge | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | AVC-FillerNet | event-based F1 score | 92.8 | | Unverified
2 | VC-FillerNet | event-based F1 score | 71 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Baseline (MIC) | Class-dependent localization error | 32.2 | | Unverified
2 | Baseline (FOA) | Class-dependent localization error | 29.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | DualQSELD-TCN (parallel) | SELD score | 0.32 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | STL-SNN | accuracy | 98.4 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SALSA-FOA | ER≤20° | 0.38 | | Unverified