Target Sound Extraction

Target Sound Extraction is the task of extracting the sound corresponding to a given class from an audio mixture. The mixture may also contain background noise at a relatively low amplitude compared with the foreground components. The target sound class is provided to the model as an input, in the form of a string, an integer, or a one-hot encoding of the class.
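The three query forms mentioned above (string, integer, one-hot) can be converted into one another with a fixed class vocabulary. The following is a minimal sketch of that conversion, not the interface of any particular model: the vocabulary `SOUND_CLASSES` and the helper names `class_to_index`, `index_to_one_hot`, and `encode_query` are hypothetical and chosen only for illustration.

```python
# Hypothetical class vocabulary (an assumption); a real model defines its own label set.
SOUND_CLASSES = ["dog_bark", "siren", "speech", "door_knock"]


def class_to_index(label: str) -> int:
    """Map a class-name string to its integer index in SOUND_CLASSES."""
    if label not in SOUND_CLASSES:
        raise ValueError(f"unknown sound class: {label!r}")
    return SOUND_CLASSES.index(label)


def index_to_one_hot(index: int, num_classes: int = len(SOUND_CLASSES)) -> list:
    """Turn an integer class index into a one-hot vector of length num_classes."""
    if not 0 <= index < num_classes:
        raise ValueError(f"class index out of range: {index}")
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


def encode_query(query) -> list:
    """Accept a class name (str) or a class index (int) and return a one-hot vector."""
    index = class_to_index(query) if isinstance(query, str) else int(query)
    return index_to_one_hot(index)


# The string and integer forms of the same class give the same conditioning vector.
by_name = encode_query("siren")
by_index = encode_query(1)
```

In a typical setup, a vector like `by_name` would be passed to the extraction model alongside the mixture waveform; how the model consumes it (embedding lookup, concatenation, or modulation) varies between the systems listed below.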

Papers

Showing 11–16 of 16 papers

Title	Date	Tasks	Status
Leveraging Audio-Only Data for Text-Queried Target Sound Extraction	Sep 20, 2024	Target Sound Extraction	Unverified
Language-Queried Target Sound Extraction Without Parallel Training Data	Sep 14, 2024	Language Modelling, Large Language Model	Unverified
Few-shot learning of new sound classes for target sound extraction	Jun 14, 2021	Few-Shot Learning, Target Sound Extraction	Unverified
SoundBeam: Target sound extraction conditioned on sound-class labels and enrollment clues for increased performance and continuous learning	Apr 8, 2022	Target Sound Extraction	Unverified
SoundSculpt: Direction and Semantics Driven Ambisonic Target Sound Extraction	May 30, 2025	Image Segmentation, Semantic Segmentation	Unverified
CATSE: A Context-Aware Framework for Causal Target Sound Extraction	Mar 21, 2024	Target Sound Extraction	Unverified

Benchmark Results

AudioCaps

#	Model	Metric	Claimed	Verified	Status
1	CLAPSep	SDRi	10.08	—	Unverified

AudioSet

#	Model	Metric	Claimed	Verified	Status
1	CLAPSep	SDRi	9.29	—	Unverified

FSDSoundScapes

#	Model	Metric	Claimed	Verified	Status
1	Waveformer	SI-SNRi	9.43	—	Unverified
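
The SI-SNRi metric reported above is the improvement in scale-invariant signal-to-noise ratio: the SI-SNR of the extracted signal minus the SI-SNR of the unprocessed mixture, both measured in dB against the ground-truth target. The sketch below uses the commonly cited definition in pure Python. It is an illustration only; the function names are hypothetical, and the evaluation scripts behind the listed results may differ in details such as mean removal or the `eps` constant.

```python
import math


def _zero_mean(signal):
    """Remove the DC offset, as is commonly done before computing SI-SNR."""
    mean = sum(signal) / len(signal)
    return [v - mean for v in signal]


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB of `estimate` with respect to `reference`."""
    est = _zero_mean(estimate)
    ref = _zero_mean(reference)
    # Project the estimate onto the reference to obtain the scaled target part.
    scale = sum(e * r for e, r in zip(est, ref)) / (sum(r * r for r in ref) + eps)
    target = [scale * r for r in ref]
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(n * n for n in noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))


def si_snr_improvement(estimate, mixture, reference):
    """SI-SNRi: gain of the estimate over the unprocessed mixture, in dB."""
    return si_snr(estimate, reference) - si_snr(mixture, reference)


# Toy signals (assumed for illustration): a target tone plus an interfering tone.
n = 1000
reference = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
interferer = [math.sin(2 * math.pi * 40 * t / n) for t in range(n)]
mixture = [r + i for r, i in zip(reference, interferer)]
# A pretend extraction result that attenuates the interferer to 10% of its amplitude.
estimate = [r + 0.1 * i for r, i in zip(reference, interferer)]
improvement_db = si_snr_improvement(estimate, mixture, reference)
```

In this toy case the interferer is attenuated by a factor of 10 in amplitude, so the improvement comes out at roughly 20 dB. SDRi is the analogous improvement computed with signal-to-distortion ratio, and the two metrics are not directly comparable across tables.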