SOTAVerified

Sound Event Detection

Sound Event Detection (SED) is the task of recognizing the sound events in a recording together with their start and end times. In real life, sound events do not always occur in isolation but tend to overlap considerably. Recognizing such overlapping sound events is referred to as polyphonic SED.

Source: A report on sound event detection with different binaural features
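To make the task definition concrete, here is a minimal sketch of one common way to represent polyphonic annotations: each event is a (label, onset, offset) triple, and the recording is split into fixed-length frames with one activity flag per class. The labels, times, 0.5 s hop, and function names are illustrative assumptions and are not taken from any of the papers listed here.

```python
import math
from typing import List, Tuple

# Hypothetical annotation format: (label, onset_seconds, offset_seconds)
Event = Tuple[str, float, float]


def events_to_frames(events: List[Event], labels: List[str],
                     duration: float, hop: float = 0.5) -> List[List[int]]:
    """Return a frames x labels matrix of 0/1 activity flags."""
    n_frames = int(round(duration / hop))
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in range(n_frames)]
    for label, onset, offset in events:
        first = max(0, int(onset / hop))                # first frame the event touches
        last = min(n_frames, math.ceil(offset / hop))   # one past the last frame
        for t in range(first, last):
            matrix[t][index[label]] = 1
    return matrix


def polyphony(matrix: List[List[int]]) -> List[int]:
    """Number of simultaneously active classes in each frame."""
    return [sum(row) for row in matrix]


# Three made-up events that overlap between 1.0 s and 1.5 s.
labels = ["speech", "dog_bark", "car"]
events = [("speech", 0.0, 2.0), ("dog_bark", 1.0, 1.5), ("car", 1.0, 3.0)]
frames = events_to_frames(events, labels, duration=3.0, hop=0.5)
print(polyphony(frames))  # [1, 1, 3, 2, 1, 1]
```

A frame with a polyphony above 1 is where a monophonic detector would have to drop events; a polyphonic system predicts every column of the matrix independently.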

Papers

Showing 1-25 of 194 papers

Title | Status | Hype
MAT-SED: A Masked Audio Transformer with Masked-Reconstruction Based Pre-training for Sound Event Detection | Code | 2
Prototype based Masked Audio Model for Self-Supervised Learning of Sound Event Detection | Code | 2
Mind the Domain Gap: a Systematic Analysis on Bioacoustic Sound Event Detection | Code | 2
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection | Code | 2
WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research | Code | 2
A Hybrid System of Sound Event Detection Transformer and Frame-wise Model for DCASE 2022 Task 4 | Code | 1
Fusion of Audio and Visual Embeddings for Sound Event Localization and Detection | Code | 1
Improving Audio Spectrogram Transformers for Sound Event Detection Through Multi-Stage Training | Code | 1
FilterAugment: An Acoustic Environmental Data Augmentation Method | Code | 1
Few-shot bioacoustic event detection at the DCASE 2022 challenge | Code | 1
Frequency Dynamic Convolution: Frequency-Adaptive Pattern Recognition for Sound Event Detection | Code | 1
Full-frequency dynamic convolution: a physical frequency-dependent convolution for sound event detection | Code | 1
Fine-tune the pretrained ATST model for sound event detection | Code | 1
Heavily Augmented Sound Event Detection utilizing Weak Predictions | Code | 1
A dataset for Audio-Visual Sound Event Detection in Movies | Code | 1
Diversifying and Expanding Frequency-Adaptive Convolution Kernels for Sound Event Detection | Code | 1
DCASE 2021 Task 3: Spectrotemporally-aligned Features for Polyphonic Sound Event Localization and Detection | Code | 1
DENet: a deep architecture for audio surveillance applications | Code | 1
Exploring Performance-Complexity Trade-Offs in Sound Event Detection Models | Code | 1
Exploring Text-Queried Sound Event Detection with Audio Source Separation | Code | 1
AD-YOLO: You Look Only Once in Training Multiple Sound Event Localization and Detection | Code | 1
Few-shot bioacoustic event detection at the DCASE 2023 challenge | Code | 1
ACCDOA: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization and Detection | Code | 1
Frequency Dependent Sound Event Detection for DCASE 2022 Challenge Task 4 | Code | 1
Couple Learning for semi-supervised sound event detection | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ATST-SED | event-based F1 score | 63.4 | | Unverified
2 | SE-CRNN-16 with DualKD | event-based F1 score | 55.6 | | Unverified
3 | FDY-CRNN | event-based F1 score | 54 | | Unverified
4 | HTS-AT | event-based F1 score | 50.7 | | Unverified
5 | RCT | event-based F1 score | 49.62 | | Unverified
6 | FiltAug SED | event-based F1 score | 49.6 | | Unverified
7 | SED-SSep baseline dcase task 4 2020 v2 | event-based F1 score | 40.7 | | Unverified
8 | Baseline dcase task 4 2020 v2 | event-based F1 score | 39 | | Unverified
9 | Baseline | event-based F1 score | 25.8 | | Unverified
10 | MAT-SED | PSDS1 | 0.59 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | PHC SEDnet n=8 | Error Rate | 0.56 | | Unverified
2 | Quaternion SEDnet | Error Rate | 0.52 | | Unverified
3 | PHC SEDnet n=16 | Error Rate | 0.51 | | Unverified
4 | PHC SEDnet n=4 | Error Rate | 0.45 | | Unverified
5 | PHC SEDnet n=2 | Error Rate | 0.39 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CRNN (with BEATs + Separation) | PSDS1 (-5dB) | 0.13 | | Unverified
2 | CRNN (with BEATs) | PSDS1 (-5dB) | 0.07 | | Unverified
3 | CRNN (WildDESED + Curriculum learning) | PSDS1 (-5dB) | 0.05 | | Unverified
4 | CRNN (WildDESED) | PSDS1 (-5dB) | 0.05 | | Unverified
5 | CRNN | PSDS1 (-5dB) | 0.02 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | DENet | Rank-1 Recognition Rate | 0.98 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | DENet | Rank-1 Recognition Rate | 1 | | Unverified