Sound Event Detection

Sound Event Detection (SED) is the task of recognizing the sound events in a recording together with their respective start and end times. Sound events in real life do not always occur in isolation, but tend to overlap considerably with each other. Recognizing such overlapping sound events is referred to as polyphonic SED.

Source: A report on sound event detection with different binaural features
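
To make the definition concrete, the following is a minimal sketch of how a polyphonic SED output might be turned into events with start and end times. It assumes a model that emits one probability per class per frame; the function name `decode_events`, the threshold of 0.5, and the hop size are illustrative assumptions, not taken from any paper listed here.

```python
from typing import Dict, List, Tuple


def decode_events(
    frame_probs: Dict[str, List[float]],
    hop_seconds: float = 0.02,   # assumed frame hop, purely illustrative
    threshold: float = 0.5,      # assumed decision threshold
) -> Dict[str, List[Tuple[float, float]]]:
    """Turn per-class frame probabilities into (onset, offset) pairs in seconds.

    Each class is decoded independently, so events of different classes
    may overlap in time (the polyphonic case).
    """
    events: Dict[str, List[Tuple[float, float]]] = {}
    for label, probs in frame_probs.items():
        segments: List[Tuple[float, float]] = []
        start = None  # index of the first frame of the currently open event
        for i, p in enumerate(probs):
            active = p >= threshold
            if active and start is None:
                start = i
            elif not active and start is not None:
                segments.append((start * hop_seconds, i * hop_seconds))
                start = None
        if start is not None:  # event still active at the end of the clip
            segments.append((start * hop_seconds, len(probs) * hop_seconds))
        events[label] = segments
    return events


# Toy example: two classes that overlap between 1.0 s and 2.0 s.
frame_probs = {
    "speech": [0.1, 0.8, 0.9, 0.7, 0.2, 0.1],
    "dog_bark": [0.0, 0.0, 0.6, 0.9, 0.8, 0.1],
}
events = decode_events(frame_probs, hop_seconds=0.5)
print(events)
```

Practical systems usually add smoothing (for example median filtering) and class-specific thresholds before this step; the sketch leaves those out.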

Papers


Showing 1–50 of 194 papers

Title	Date	Tasks	Status	Hype
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection	Feb 2, 2022	Audio Classification, Event Detection	Code Available	2
Mind the Domain Gap: a Systematic Analysis on Bioacoustic Sound Event Detection	Mar 27, 2024	Data Augmentation, Domain Adaptation	Code Available	2
Prototype based Masked Audio Model for Self-Supervised Learning of Sound Event Detection	Sep 26, 2024	Event Detection, Representation Learning	Code Available	2
MAT-SED: A Masked Audio Transformer with Masked-Reconstruction Based Pre-training for Sound Event Detection	Aug 16, 2024	Event Detection, Sound Event Detection	Code Available	2
WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research	Mar 30, 2023	Audio captioning, Event Detection	Code Available	2
Towards Deep Learning Models Resistant to Adversarial Attacks	Jun 19, 2017	Adversarial Attack, Adversarial Defense	Code Available	1
Sound Event Detection Transformer: An Event-based End-to-End Model for Sound Event Detection	Oct 5, 2021	Audio Tagging, Boundary Detection	Code Available	1
UniAV: Unified Audio-Visual Perception for Multi-Task Video Event Localization	Apr 4, 2024	Action Localization, audio-visual event localization	Code Available	1
Exploring Performance-Complexity Trade-Offs in Sound Event Detection Models	Mar 14, 2025	Audio Tagging, Event Detection	Code Available	1
Multi-Iteration Multi-Stage Fine-Tuning of Transformers for Sound Event Detection with Heterogeneous Datasets	Jul 17, 2024	Event Detection, Sound Event Detection	Code Available	1
SELD-TCN: Sound Event Localization & Detection via Temporal Convolutional Networks	Mar 3, 2020	Acoustic Scene Classification, GPU	Code Available	1
Sound Event Bounding Boxes	Jun 6, 2024	Change Detection, Event Detection	Code Available	1
Sound Event Detection with Depthwise Separable and Dilated Convolutions	Feb 2, 2020	Event Detection, Sound Event Detection	Code Available	1
Threshold Independent Evaluation of Sound Event Detection Scores	Jan 31, 2022	Event Detection, Sound Event Detection	Code Available	1
Pushing the Limit of Sound Event Detection with Multi-Dilated Frequency Dynamic Convolution	Jun 19, 2024	Event Detection, Sound Event Detection	Code Available	1
Heavily Augmented Sound Event Detection utilizing Weak Predictions	Jul 8, 2021	Data Augmentation, Event Detection	Code Available	1
Fusion of Audio and Visual Embeddings for Sound Event Localization and Detection	Dec 14, 2023	Data Augmentation, Event Detection	Code Available	1
Improving Audio Spectrogram Transformers for Sound Event Detection Through Multi-Stage Training	Jul 17, 2024	Event Detection, Missing Labels	Code Available	1
Leveraging LLM and Text-Queried Separation for Noise-Robust Sound Event Detection	Nov 2, 2024	Audio Source Separation, Event Detection	Code Available	1
PHNNs: Lightweight Neural Networks via Parameterized Hypercomplex Convolutions	Oct 8, 2021	Sound Event Detection	Code Available	1
AD-YOLO: You Look Only Once in Training Multiple Sound Event Localization and Detection	Mar 28, 2023	Direction of Arrival Estimation, Sound Event Detection	Code Available	1
Pretraining Representations for Bioacoustic Few-shot Detection using Supervised Contrastive Learning	Sep 2, 2023	Contrastive Learning, Data Augmentation	Code Available	1
Few-shot bioacoustic event detection at the DCASE 2023 challenge	Jun 15, 2023	Event Detection, Few-Shot Learning	Code Available	1
SALSA: Spatial Cue-Augmented Log-Spectrogram Features for Polyphonic Sound Event Localization and Detection	Oct 1, 2021	Direction of Arrival Estimation, Event Detection	Code Available	1
Self-supervised Audio Teacher-Student Transformer for Both Clip-level and Frame-level Tasks	Jun 7, 2023	Audio Classification, Audio Tagging	Code Available	1
Self Training and Ensembling Frequency Dependent Networks with Coarse Prediction Pooling and Sound Event Bounding Boxes	Jun 22, 2024	Change Detection, Data Augmentation	Code Available	1
A Hybrid System of Sound Event Detection Transformer and Frame-wise Model for DCASE 2022 Task 4	Oct 18, 2022	Event Detection, Metric Learning	Code Available	1
Few-shot bioacoustic event detection at the DCASE 2022 challenge	Jul 14, 2022	Event Detection, Sound Event Detection	Code Available	1
Sound Event Localization and Detection for Real Spatial Sound Scenes: Event-Independent Network and Data Augmentation Chains	Sep 5, 2022	Data Augmentation, Direction of Arrival Estimation	Code Available	1
The impact of non-target events in synthetic soundscapes for sound event detection	Sep 28, 2021	Event Detection, Sound Event Detection	Code Available	1
Frequency Dependent Sound Event Detection for DCASE 2022 Challenge Task 4	Jun 23, 2022	Event Detection, Sound Event Detection	Code Available	1
Forward-Backward Convolutional Recurrent Neural Networks and Tag-Conditioned Convolutional Neural Networks for Weakly Labeled Semi-supervised Sound Event Detection	Mar 11, 2021	Event Detection, Sound Event Detection	Code Available	1
Frequency Dynamic Convolution: Frequency-Adaptive Pattern Recognition for Sound Event Detection	Mar 29, 2022	Event Detection, Sound Event Detection	Code Available	1
Couple Learning for semi-supervised sound event detection	Oct 12, 2021	Event Detection, Sound Event Detection	Code Available	1
A dataset for Audio-Visual Sound Event Detection in Movies	Feb 14, 2023	Event Detection, Self-Driving Cars	Code Available	1
DCASE 2021 Task 3: Spectrotemporally-aligned Features for Polyphonic Sound Event Localization and Detection	Jun 29, 2021	Audio Classification, Direction of Arrival Estimation	Code Available	1
FilterAugment: An Acoustic Environmental Data Augmentation Method	Oct 7, 2021	Data Augmentation, Event Detection	Code Available	1
DCASENET: A joint pre-trained deep neural network for detecting and classifying acoustic scenes and events	Sep 21, 2020	Acoustic Scene Classification, Audio Tagging	Code Available	1
Exploring Text-Queried Sound Event Detection with Audio Source Separation	Sep 20, 2024	Audio Source Separation, Event Detection	Code Available	1
DENet: a deep architecture for audio surveillance applications	Jan 11, 2021	Denoising, Sound Event Detection	Code Available	1
Fine-tune the pretrained ATST model for sound event detection	Sep 15, 2023	Event Detection, Self-Supervised Learning	Code Available	1
Full-frequency dynamic convolution: a physical frequency-dependent convolution for sound event detection	Jan 10, 2024	Event Detection, Sound Event Detection	Code Available	1
Improving weakly supervised sound event detection with self-supervised auxiliary tasks	Jun 12, 2021	Decoder, Event Detection	Code Available	1
Diversifying and Expanding Frequency-Adaptive Convolution Kernels for Sound Event Detection	Jun 8, 2024	Event Detection, Sound Event Detection	Code Available	1
Multi-Task Learning for Interpretable Weakly Labelled Sound Event Detection	Aug 17, 2020	Event Detection, Multiple Instance Learning	Code Available	1
Post-Processing Independent Evaluation of Sound Event Detection Systems	Jun 27, 2023	Event Detection, Sound Event Detection	Code Available	1
ACCDOA: Activity-Coupled Cartesian Direction of Arrival Representation for Sound Event Localization and Detection	Oct 29, 2020	Event Detection, Sound Event Detection	Code Available	1
Effective Pre-Training of Audio Transformers for Sound Event Detection	Sep 14, 2024	Data Augmentation, Event Detection	Code Available	1
Revisiting Deep Audio-Text Retrieval Through the Lens of Transportation	May 16, 2024	AudioCaps, Event Detection	Code Available	1
RCT: Random Consistency Training for Semi-supervised Sound Event Detection	Oct 21, 2021	Data Augmentation, Event Detection	Code Available	1


Datasets: DESED, L3DAS21, WildDESED, Mivia Audio Events, Mivia Road Events

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ATST-SED	event-based F1 score	63.4	—	Unverified
2	SE-CRNN-16 with DualKD	event-based F1 score	55.6	—	Unverified
3	FDY-CRNN	event-based F1 score	54	—	Unverified
4	HTS-AT	event-based F1 score	50.7	—	Unverified
5	RCT	event-based F1 score	49.62	—	Unverified
6	FiltAug SED	event-based F1 score	49.6	—	Unverified
7	SED-SSep baseline dcase task 4 2020 v2	event-based F1 score	40.7	—	Unverified
8	Baseline dcase task 4 2020 v2	event-based F1 score	39	—	Unverified
9	Baseline	event-based F1 score	25.8	—	Unverified
10	MAT-SED	PSDS1	0.59	—	Unverified
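
For readers unfamiliar with the event-based F1 score reported above, here is a simplified sketch of how such a score can be computed. It assumes the tolerance convention commonly used in DCASE-style evaluation (a 200 ms onset collar, and an offset collar of 200 ms or 20% of the reference event length, whichever is larger) and a greedy one-to-one matching. The function name `event_f1` and its parameters are hypothetical, the result is micro-averaged in the range 0 to 1 rather than a percentage, and reference toolkits may match events and average over classes differently.

```python
from typing import List, Tuple

Event = Tuple[str, float, float]  # (label, onset_seconds, offset_seconds)


def event_f1(
    reference: List[Event],
    estimated: List[Event],
    collar: float = 0.2,        # assumed onset/offset tolerance in seconds
    offset_ratio: float = 0.2,  # assumed offset tolerance as a fraction of event length
) -> float:
    """Micro-averaged event-based F1 with greedy one-to-one matching."""
    matched = set()  # indices of reference events already paired
    tp = 0
    for label, onset, offset in estimated:
        for j, (r_label, r_onset, r_offset) in enumerate(reference):
            if j in matched or r_label != label:
                continue
            offset_tol = max(collar, offset_ratio * (r_offset - r_onset))
            if abs(onset - r_onset) <= collar and abs(offset - r_offset) <= offset_tol:
                matched.add(j)
                tp += 1
                break
    fp = len(estimated) - tp  # detections with no matching reference event
    fn = len(reference) - tp  # reference events that were missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


reference = [("speech", 0.5, 2.0), ("dog_bark", 1.0, 2.5)]
estimated = [("speech", 0.6, 2.1), ("dog_bark", 1.5, 2.5)]  # bark onset is 0.5 s late
score = event_f1(reference, estimated)
print(score)  # 0.5: one true positive, one false positive, one false negative
```

Because a detection counts only if both boundaries fall within tolerance, this metric penalizes timing errors much more heavily than frame-level or segment-level scores.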

#	Model	Metric	Claimed	Verified	Status
1	PHC SEDnet n=8	Error Rate	0.56	—	Unverified
2	Quaternion SEDnet	Error Rate	0.52	—	Unverified
3	PHC SEDnet n=16	Error Rate	0.51	—	Unverified
4	PHC SEDnet n=4	Error Rate	0.45	—	Unverified
5	PHC SEDnet n=2	Error Rate	0.39	—	Unverified
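
The table above does not state which error-rate definition is used. One widely used definition in polyphonic SED is the segment-based error rate, ER = (S + D + I) / N, where substitutions S, deletions D, and insertions I are counted per fixed-length segment and N is the total number of active reference events. The sketch below illustrates that definition only; `segment_error_rate` is a hypothetical helper, and lower values are better.

```python
from typing import List, Set


def segment_error_rate(reference: List[Set[str]], estimated: List[Set[str]]) -> float:
    """Segment-based error rate.

    Both arguments hold one set of active labels per segment and are
    assumed to be aligned and of equal length.
    """
    subs = dels = ins = n_ref = 0
    for ref, est in zip(reference, estimated):
        fn = len(ref - est)  # active in the reference but not detected
        fp = len(est - ref)  # detected but not active in the reference
        s = min(fn, fp)      # a miss paired with a false alarm is a substitution
        subs += s
        dels += fn - s       # remaining misses are deletions
        ins += fp - s        # remaining false alarms are insertions
        n_ref += len(ref)
    return (subs + dels + ins) / n_ref if n_ref else 0.0


reference = [{"speech"}, {"speech", "dog"}, {"dog"}, set()]
estimated = [{"speech"}, {"speech"}, {"car"}, {"car"}]
er = segment_error_rate(reference, estimated)
print(er)  # 0.75: one deletion, one substitution, one insertion over N = 4
```

Note that this error rate can exceed 1 when a system produces many insertions.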

#	Model	Metric	Claimed	Verified	Status
1	CRNN (with BEATs + Separation)	PSDS1 (-5dB)	0.13	—	Unverified
2	CRNN (with BEATs)	PSDS1 (-5dB)	0.07	—	Unverified
3	CRNN (WildDESED + Curriculum learning)	PSDS1 (-5dB)	0.05	—	Unverified
4	CRNN (WildDESED)	PSDS1 (-5dB)	0.05	—	Unverified
5	CRNN	PSDS1 (-5dB)	0.02	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	DENet	Rank-1 Recognition Rate	0.98	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	DENet	Rank-1 Recognition Rate	1	—	Unverified