
Sound Event Detection

Sound Event Detection (SED) is the task of recognizing the sound events in a recording together with their start and end times. Sound events in real life do not always occur in isolation but tend to overlap considerably with each other. Recognizing such overlapping sound events is referred to as polyphonic SED.

Source: A report on sound event detection with different binaural features
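To make the task definition concrete, the sketch below turns a frame-by-class activity matrix into a list of events with start and end times. It is only an illustration under stated assumptions: the function name `frames_to_events`, the label names, and the hop size are hypothetical, and the model outputs are assumed to have already been thresholded to 0/1 values. It is not taken from any of the papers listed on this page.

```python
from typing import List, Sequence, Tuple

# An event is (label, onset_seconds, offset_seconds).
Event = Tuple[str, float, float]


def frames_to_events(
    activity: Sequence[Sequence[int]],
    labels: Sequence[str],
    hop_seconds: float,
) -> List[Event]:
    """Convert a frame-by-class 0/1 matrix into timed events.

    activity[t][c] is 1 when class c is active in frame t (hypothetical
    input layout). Each class is scanned independently, so events of
    different classes may overlap in time (the polyphonic case).
    """
    events: List[Event] = []
    for class_index, label in enumerate(labels):
        onset_frame = None  # frame where the current run of activity began
        for frame_index, row in enumerate(activity):
            active = bool(row[class_index])
            if active and onset_frame is None:
                onset_frame = frame_index
            elif not active and onset_frame is not None:
                events.append(
                    (label, onset_frame * hop_seconds, frame_index * hop_seconds)
                )
                onset_frame = None
        # Close an event that is still active at the end of the recording.
        if onset_frame is not None:
            events.append(
                (label, onset_frame * hop_seconds, len(activity) * hop_seconds)
            )
    # Sort by onset time, then by label, for a stable listing.
    return sorted(events, key=lambda event: (event[1], event[0]))


# Example with two overlapping classes and a 0.5 s hop (assumed values).
example_labels = ["speech", "dog_bark"]
example_activity = [
    [1, 0],
    [1, 1],
    [1, 1],
    [0, 1],
    [0, 0],
]
example_events = frames_to_events(example_activity, example_labels, 0.5)
```

In this example the two events overlap between 0.5 s and 1.5 s, which is the situation that polyphonic SED has to handle.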

Papers

Showing 1-50 of 194 papers

Title | Status | Hype
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection | Code | 2
Mind the Domain Gap: a Systematic Analysis on Bioacoustic Sound Event Detection | Code | 2
Prototype based Masked Audio Model for Self-Supervised Learning of Sound Event Detection | Code | 2
MAT-SED: A Masked Audio Transformer with Masked-Reconstruction Based Pre-training for Sound Event Detection | Code | 2
WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research | Code | 2
Towards Deep Learning Models Resistant to Adversarial Attacks | Code | 1
Sound Event Detection Transformer: An Event-based End-to-End Model for Sound Event Detection | Code | 1
UniAV: Unified Audio-Visual Perception for Multi-Task Video Event Localization | Code | 1
Exploring Performance-Complexity Trade-Offs in Sound Event Detection Models | Code | 1
Multi-Iteration Multi-Stage Fine-Tuning of Transformers for Sound Event Detection with Heterogeneous Datasets | Code | 1
SELD-TCN: Sound Event Localization & Detection via Temporal Convolutional Networks | Code | 1
Sound Event Bounding Boxes | Code | 1
Sound Event Detection with Depthwise Separable and Dilated Convolutions | Code | 1
Threshold Independent Evaluation of Sound Event Detection Scores | Code | 1
Pushing the Limit of Sound Event Detection with Multi-Dilated Frequency Dynamic Convolution | Code | 1
Heavily Augmented Sound Event Detection utilizing Weak Predictions | Code | 1
Fusion of Audio and Visual Embeddings for Sound Event Localization and Detection | Code | 1
Improving Audio Spectrogram Transformers for Sound Event Detection Through Multi-Stage Training | Code | 1
Leveraging LLM and Text-Queried Separation for Noise-Robust Sound Event Detection | Code | 1
PHNNs: Lightweight Neural Networks via Parameterized Hypercomplex Convolutions | Code | 1
AD-YOLO: You Look Only Once in Training Multiple Sound Event Localization and Detection | Code | 1
Pretraining Representations for Bioacoustic Few-shot Detection using Supervised Contrastive Learning | Code | 1
Few-shot bioacoustic event detection at the DCASE 2023 challenge | Code | 1
SALSA: Spatial Cue-Augmented Log-Spectrogram Features for Polyphonic Sound Event Localization and Detection | Code | 1
Self-supervised Audio Teacher-Student Transformer for Both Clip-level and Frame-level Tasks | Code | 1
Self Training and Ensembling Frequency Dependent Networks with Coarse Prediction Pooling and Sound Event Bounding Boxes | Code | 1
A Hybrid System of Sound Event Detection Transformer and Frame-wise Model for DCASE 2022 Task 4 | Code | 1
Few-shot bioacoustic event detection at the DCASE 2022 challenge | Code | 1
Sound Event Localization and Detection for Real Spatial Sound Scenes: Event-Independent Network and Data Augmentation Chains | Code | 1
The impact of non-target events in synthetic soundscapes for sound event detection | Code | 1
Frequency Dependent Sound Event Detection for DCASE 2022 Challenge Task 4 | Code | 1
Forward-Backward Convolutional Recurrent Neural Networks and Tag-Conditioned Convolutional Neural Networks for Weakly Labeled Semi-supervised Sound Event Detection | Code | 1
Frequency Dynamic Convolution: Frequency-Adaptive Pattern Recognition for Sound Event Detection | Code | 1
Couple Learning for semi-supervised sound event detection | Code | 1
A dataset for Audio-Visual Sound Event Detection in Movies | Code | 1
DCASE 2021 Task 3: Spectrotemporally-aligned Features for Polyphonic Sound Event Localization and Detection | Code | 1
FilterAugment: An Acoustic Environmental Data Augmentation Method | Code | 1
DCASENET: A joint pre-trained deep neural network for detecting and classifying acoustic scenes and events | Code | 1
Exploring Text-Queried Sound Event Detection with Audio Source Separation | Code | 1
DENet: a deep architecture for audio surveillance applications | Code | 1
Fine-tune the pretrained ATST model for sound event detection | Code | 1
Full-frequency dynamic convolution: a physical frequency-dependent convolution for sound event detection | Code | 1
Improving weakly supervised sound event detection with self-supervised auxiliary tasks | Code | 1
Diversifying and Expanding Frequency-Adaptive Convolution Kernels for Sound Event Detection | Code | 1
Multi-Task Learning for Interpretable Weakly Labelled Sound Event Detection | Code | 1
Post-Processing Independent Evaluation of Sound Event Detection Systems | Code | 1
ACCDOA: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization and Detection | Code | 1
Effective Pre-Training of Audio Transformers for Sound Event Detection | Code | 1
Revisiting Deep Audio-Text Retrieval Through the Lens of Transportation | Code | 1
RCT: Random Consistency Training for Semi-supervised Sound Event Detection | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ATST-SED | event-based F1 score | 63.4 | | Unverified
2 | SE-CRNN-16 with DualKD | event-based F1 score | 55.6 | | Unverified
3 | FDY-CRNN | event-based F1 score | 54 | | Unverified
4 | HTS-AT | event-based F1 score | 50.7 | | Unverified
5 | RCT | event-based F1 score | 49.62 | | Unverified
6 | FiltAug SED | event-based F1 score | 49.6 | | Unverified
7 | SED-SSep baseline dcase task 4 2020 v2 | event-based F1 score | 40.7 | | Unverified
8 | Baseline dcase task 4 2020 v2 | event-based F1 score | 39 | | Unverified
9 | Baseline | event-based F1 score | 25.8 | | Unverified
10 | MAT-SED | PSDS1 | 0.59 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | PHC SEDnet n=8 | Error Rate | 0.56 | | Unverified
2 | Quaternion SEDnet | Error Rate | 0.52 | | Unverified
3 | PHC SEDnet n=16 | Error Rate | 0.51 | | Unverified
4 | PHC SEDnet n=4 | Error Rate | 0.45 | | Unverified
5 | PHC SEDnet n=2 | Error Rate | 0.39 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CRNN (with BEATs + Separation) | PSDS1 (-5dB) | 0.13 | | Unverified
2 | CRNN (with BEATs) | PSDS1 (-5dB) | 0.07 | | Unverified
3 | CRNN (WildDESED + Curriculum learning) | PSDS1 (-5dB) | 0.05 | | Unverified
4 | CRNN (WildDESED) | PSDS1 (-5dB) | 0.05 | | Unverified
5 | CRNN | PSDS1 (-5dB) | 0.02 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | DENet | Rank-1 Recognition Rate | 0.98 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | DENet | Rank-1 Recognition Rate | 1 | | Unverified