Target Sound Extraction

Target Sound Extraction is the task of extracting the sound of a given class from an audio mixture. The mixture may also contain background noise at a relatively low amplitude compared to the foreground components. The target sound class is provided to the model as a string, an integer, or a one-hot encoding.
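To make the input description concrete, the following is a minimal sketch of how a mixture and a class query might be prepared. The class vocabulary, the function names `encode_query` and `make_mixture`, and the `noise_gain` value are all assumptions made for illustration; they do not come from any of the listed papers.

```python
import numpy as np

# Hypothetical class vocabulary, chosen only for illustration.
CLASSES = ["dog_bark", "siren", "speech"]

def encode_query(label):
    """Return one class query in three forms: string, integer index, one-hot vector."""
    index = CLASSES.index(label)
    one_hot = np.zeros(len(CLASSES), dtype=np.float32)
    one_hot[index] = 1.0
    return label, index, one_hot

def make_mixture(sources, noise, noise_gain=0.1):
    """Sum the foreground sources and add background noise at a lower amplitude."""
    foreground = np.sum(np.stack(sources), axis=0)
    return foreground + noise_gain * noise

# One second of audio at an assumed 16 kHz sample rate.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
sources = [np.sin(2 * np.pi * 440.0 * t), np.sin(2 * np.pi * 880.0 * t)]
noise = rng.standard_normal(t.shape[0])

mixture = make_mixture(sources, noise)
label, index, one_hot = encode_query("siren")
```

A model for this task would take `mixture` together with one of the three query forms and return an estimate of the source belonging to the queried class.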

Papers


Showing 1–10 of 16 papers

Title	Date	Tasks	Status	Hype
SoundSculpt: Direction and Semantics Driven Ambisonic Target Sound Extraction	May 30, 2025	Image Segmentation, Semantic Segmentation	Unverified	0
Leveraging Audio-Only Data for Text-Queried Target Sound Extraction	Sep 20, 2024	Target Sound Extraction	Unverified	0
Multichannel-to-Multichannel Target Sound Extraction Using Direction and Timestamp Clues	Sep 19, 2024	Inductive Bias, Target Sound Extraction	Unverified	0
Language-Queried Target Sound Extraction Without Parallel Training Data	Sep 14, 2024	Language Modelling, Large Language Model	Unverified	0
SoloAudio: Target Sound Extraction with Language-oriented Audio Diffusion Transformer	Sep 12, 2024	Target Sound Extraction	Code Available	2
Cross-attention Inspired Selective State Space Models for Target Sound Extraction	Sep 7, 2024	Computational Efficiency, Mamba	Code Available	1
Can all variations within the unified mask-based beamformer framework achieve identical peak extraction performance?	Jul 22, 2024	All, Target Sound Extraction	Code Available	0
CATSE: A Context-Aware Framework for Causal Target Sound Extraction	Mar 21, 2024	Target Sound Extraction	Unverified	0
CLAPSep: Leveraging Contrastive Pre-trained Model for Multi-Modal Query-Conditioned Target Sound Extraction	Feb 27, 2024	Target Sound Extraction	Code Available	1
Online Similarity-and-Independence-Aware Beamformer for Low-latency Target Sound Extraction	Dec 27, 2023	Blind Source Separation, Target Sound Extraction	Unverified	0


Benchmark Results

Dataset	#	Model	Metric	Claimed	Verified	Status
AudioCaps	1	CLAPSep	SDRi	10.08	—	Unverified
AudioSet	1	CLAPSep	SDRi	9.29	—	Unverified
FSDSoundScapes	1	Waveformer	SI-SNRi	9.43	—	Unverified
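The SI-SNRi metric reported above is the improvement in scale-invariant signal-to-noise ratio of the model output over the unprocessed mixture, in dB. The sketch below shows one common way to compute it with NumPy; the function names and the toy signals are assumptions for illustration, and it is not the evaluation code of any listed paper. SDRi is the analogous improvement in signal-to-distortion ratio, whose exact definition varies between toolkits and is not covered here.

```python
import numpy as np

def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between a 1-D estimate and a 1-D reference."""
    estimate = estimate - estimate.mean()
    target = target - target.mean()
    # Project the estimate onto the target to remove any gain mismatch.
    scale = np.dot(estimate, target) / (np.dot(target, target) + eps)
    projection = scale * target
    residual = estimate - projection
    return 10.0 * np.log10(
        (np.dot(projection, projection) + eps) / (np.dot(residual, residual) + eps)
    )

def si_snr_improvement(estimate, mixture, target):
    """SI-SNRi: SI-SNR of the estimate minus SI-SNR of the raw mixture."""
    return si_snr(estimate, target) - si_snr(mixture, target)

# Toy signals: a sine target plus noise, and an estimate with the noise reduced tenfold.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
target = np.sin(2 * np.pi * 440.0 * t)
interference = rng.standard_normal(t.shape[0])
mixture = target + interference
estimate = target + 0.1 * interference

improvement = si_snr_improvement(estimate, mixture, target)
```

Because the interference amplitude is reduced by a factor of ten in this toy estimate, the improvement comes out close to 20 dB.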