SOTAVerified

Sound Event Localization and Detection

Given multichannel audio input, a sound event localization and detection (SELD) system outputs a temporal activation track for each of the target sound classes, along with one or more corresponding spatial trajectories whenever the track indicates activity. The result is a spatio-temporal characterization of the acoustic scene that can be used in a wide range of machine cognition tasks, such as inferring the type of environment, self-localization, navigation without visual input or with occluded targets, tracking of specific types of sound sources, smart-home applications, scene visualization systems, and audio surveillance.
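As a rough illustration of the output described above, the sketch below turns per-class activity probabilities and per-frame direction-of-arrival (DOA) estimates into a list of events, each with an active frame span and a spatial trajectory. Everything here is an assumption made for illustration: the names `Event` and `decode_seld`, the 0.5 threshold, the (azimuth, elevation) convention in degrees, and the simplification to a single trajectory per class. It is not taken from any of the listed papers.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Assumed convention: (azimuth_deg, elevation_deg) per frame.
Doa = Tuple[float, float]


@dataclass
class Event:
    """One detected event: class label, active frame span, per-frame DOA."""
    label: str
    onset: int               # first active frame (inclusive)
    offset: int              # frame after the last active one (exclusive)
    trajectory: List[Doa]    # one DOA per active frame


def decode_seld(
    activity: Dict[str, List[float]],
    doa: Dict[str, List[Doa]],
    threshold: float = 0.5,  # hypothetical decision threshold
) -> List[Event]:
    """Threshold each class activation track and attach its DOA trajectory."""
    events: List[Event] = []
    for label, probs in activity.items():
        start: Optional[int] = None
        # A trailing 0.0 sentinel closes any event still active at the end.
        for t, p in enumerate(list(probs) + [0.0]):
            active = p >= threshold
            if active and start is None:
                start = t
            elif not active and start is not None:
                events.append(Event(label, start, t, doa[label][start:t]))
                start = None
    return events


# Toy input: two classes, four frames each (values are made up).
activity = {
    "speech": [0.1, 0.9, 0.8, 0.2],
    "dog_bark": [0.0, 0.0, 0.7, 0.9],
}
doa = {
    "speech": [(0.0, 0.0), (30.0, 5.0), (32.0, 5.0), (0.0, 0.0)],
    "dog_bark": [(0.0, 0.0), (0.0, 0.0), (-90.0, 0.0), (-88.0, 0.0)],
}

events = decode_seld(activity, doa)
for e in events:
    print(e.label, e.onset, e.offset, e.trajectory)
```

Systems that handle several simultaneous sources of the same class would need more than one trajectory per class, which this sketch deliberately leaves out.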

Papers

Showing 1-10 of 65 papers

Title | Status | Hype
Spatial Scaper: A Library to Simulate and Augment Soundscapes for Sound Event Localization and Detection in Realistic Rooms | Code | 2
Perception Test: A Diagnostic Benchmark for Multimodal Video Models | Code | 2
PSELDNets: Pre-trained Neural Networks on Large-scale Synthetic Datasets for Sound Event Localization and Detection | Code | 1
Learning Multi-Target TDOA Features for Sound Event Localization and Detection | Code | 1
MFF-EINV2: Multi-scale Feature Fusion across Spectral-Spatial-Temporal Domains for Sound Event Localization and Detection | Code | 1
Enhanced Sound Event Localization and Detection in Real 360-degree audio-visual soundscapes | Code | 1
Selective-Memory Meta-Learning with Environment Representations for Sound Event Localization and Detection | Code | 1
Fusion of Audio and Visual Embeddings for Sound Event Localization and Detection | Code | 1
w2v-SELD: A Sound Event Localization and Detection Framework for Self-Supervised Spatial Audio Pre-Training | Code | 1
STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | AVC-FillerNet | event-based F1 score | 92.8 | | Unverified
2 | VC-FillerNet | event-based F1 score | 71 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Baseline (MIC) | Class-dependent localization error | 32.2 | | Unverified
2 | Baseline (FOA) | Class-dependent localization error | 29.3 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | DualQSELD-TCN (parallel) | SELD score | 0.32 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | STL-SNN | accuracy | 98.4 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SALSA-FOA | ER≤20° | 0.38 | | Unverified