SOTAVerified

Music Source Separation

Music source separation is the task of decomposing a music recording into its constituent components, e.g., separate stems for the vocals, bass, and drums.

(Image credit: SigSep)

Papers

Showing 1–50 of 107 papers

Title | Status | Hype
Music Source Restoration | Code | 1
Training-Free Multi-Step Audio Source Separation | Code | 2
Is MixIT Really Unsuitable for Correlated Sources? Exploring MixIT for Unsupervised Pre-training in Music Source Separation | - | 0
Solving Copyright Infringement on Short Video Platforms: Novel Datasets and an Audio Restoration Deep Learning Pipeline | - | 0
Score-informed Music Source Separation: Improving Synthetic-to-real Generalization in Classical Music | Code | 0
Separate This, and All of these Things Around It: Music Source Separation via Hyperellipsoidal Queries | - | 0
Sanidha: A Studio Quality Multi-Modal Dataset for Carnatic Music | - | 0
MAJL: A Model-Agnostic Joint Learning Framework for Music Source Separation and Pitch Estimation | - | 0
Learned Compression for Compressed Learning | Code | 0
Music Foundation Model as Generic Booster for Music Downstream Tasks | - | 0
Task-Aware Unified Source Separation | - | 0
An Ensemble Approach to Music Source Separation: A Comparative Analysis of Conventional and Hierarchical Stem Separation | - | 0
SynthSOD: Developing an Heterogeneous Dataset for Orchestra Music Source Separation | Code | 1
Improving Real-Time Music Accompaniment Separation with MMDenseNet | - | 0
A Stem-Agnostic Single-Decoder System for Music Source Separation Beyond Four Stems | Code | 2
Why does music source separation benefit from cacophony? | - | 0
Real-time Low-latency Music Source Separation using Hybrid Spectrogram-TasNet | - | 0
A fully differentiable model for unsupervised singing voice separation | Code | 1
SCNet: Sparse Compression Network for Music Source Separation | Code | 2
Resource-constrained stereo singing voice cancellation | - | 0
Machine Perceptual Quality: Evaluating the Impact of Severe Lossy Compression on Audio and Image Models | Code | 0
Subnetwork-to-go: Elastic Neural Network with Dynamic Training and Customizable Inference | - | 0
Pre-training Music Classification Models via Music Source Separation | Code | 2
Pre-trained Spatial Priors on Multichannel NMF for Music Source Separation | - | 0
MBTFNet: Multi-Band Temporal-Frequency Neural Network For Singing Voice Enhancement | - | 0
Music Source Separation Based on a Lightweight Deep Learning Framework (DTTNET: DUAL-PATH TFC-TDF UNET) | Code | 2
Contrastive Learning based Deep Latent Masking for Music Source Separation | - | 0
The Sound Demixing Challenge 2023 – Music Demixing Track | Code | 2
Self-refining of Pseudo Labels for Music Source Separation with Noisy Labeled Data | - | 0
Sound Demixing Challenge 2023 Music Demixing Track Technical Report: TFC-TDF-UNet v3 | Code | 1
Quantifying Spatial Audio Quality Impairment | Code | 1
The Whole Is Greater than the Sum of Its Parts: Improving Music Source Separation by Bridging Network | Code | 4
Pac-HuBERT: Self-Supervised Music Source Separation via Primitive Auditory Clustering and Hidden-Unit BERT | - | 0
Hybrid Y-Net Architecture for Singing Voice Separation | - | 0
Jointist: Simultaneous Improvement of Multi-instrument Transcription and Music Source Separation via Joint Training | - | 0
Hybrid Transformers for Music Source Separation | Code | 5
An Efficient Short-Time Discrete Cosine Transform and Attentive MultiResUNet Framework for Music Source Separation | Code | 1
MedleyVox: An Evaluation Dataset for Multiple Singing Voices Separation | Code | 1
Music Mixing Style Transfer: A Contrastive Learning Approach to Disentangle Audio Effects | Code | 1
Music Source Separation with Band-split RNN | Code | 1
Multi-scale temporal-frequency attention for music source separation | - | 0
Music Separation Enhancement with Generative Modeling | - | 0
Hierarchic Temporal Convolutional Network With Cross-Domain Encoder for Music Source Separation | - | 0
0/1 Deep Neural Networks via Block Coordinate Descent | - | 0
Music Source Separation with Generative Flow | Code | 1
Low Latency Time Domain Multichannel Speech and Music Source Separation | Code | 0
VocaLiST: An Audio-Visual Synchronisation Model for Lips and Voices | Code | 1
Feature-informed Latent Space Regularization for Music Source Separation | - | 0
On loss functions and evaluation metrics for music source separation | - | 0
SpaIn-Net: Spatially-Informed Stereophonic Music Source Separation | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Sparse HT Demucs (fine tuned) | SDR (avg) | 9.2 | - | Unverified
2 | Hybrid Transformer Demucs (f.t.) | SDR (avg) | 9 | - | Unverified
3 | Band-Split RNN (semi-sup.) | SDR (avg) | 8.97 | - | Unverified
4 | TFC-TDF-UNet (v3) | SDR (avg) | 8.34 | - | Unverified
5 | Band-Split RNN | SDR (avg) | 8.23 | - | Unverified
6 | Hybrid Demucs | SDR (avg) | 7.72 | - | Unverified
7 | KUIELab-MDX-Net | SDR (avg) | 7.54 | - | Unverified
8 | CDE-HTCN | SDR (avg) | 6.89 | - | Unverified
9 | Attentive-MultiResUNet | SDR (avg) | 6.81 | - | Unverified
10 | DEMUCS (extra) | SDR (avg) | 6.79 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | BS-RoFormer (L=12, OA) | SDR (avg) | 11.99 | - | Unverified
2 | BS-RoFormer (L=6, OA) | SDR (avg) | 9.8 | - | Unverified
3 | SCNet-large | SDR (avg) | 9.69 | - | Unverified
4 | Sparse HT Demucs (fine tuned) | SDR (avg) | 9.2 | - | Unverified
5 | SCNet | SDR (avg) | 9 | - | Unverified
6 | Hybrid Transformer Demucs (f.t.) | SDR (avg) | 9 | - | Unverified
7 | Band-Split RNN (semi-sup.) | SDR (avg) | 8.97 | - | Unverified
8 | TFC-TDF-UNet (v3) | SDR (avg) | 8.34 | - | Unverified
9 | Band-Split RNN | SDR (avg) | 8.24 | - | Unverified
10 | Dual-Path TFC-TDF UNet (DTTNet) | SDR (avg) | 8.15 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | DiCoSe (Deterministic) | SI-SDRi (Bass) | 20.04 | - | Unverified
2 | LQ-VAE + Scalable Transformer | SDR (bass) | 7.42 | - | Unverified
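
For readers unfamiliar with the metric column, the following is a minimal sketch of a signal-to-distortion ratio (SDR) in dB, using the basic energy-ratio form 10 * log10(||s||^2 / ||s - s_hat||^2). It is an illustration only: the function name `sdr_db`, the toy sine-wave stems, and the plain mean over stems are assumptions made here, and the exact evaluation protocol behind each leaderboard entry (framing, aggregation, toolkit) is not stated on this page and may differ.

```python
import math


def sdr_db(reference, estimate, eps=1e-10):
    """Basic SDR in dB between two equal-length sample sequences.

    Hypothetical helper for illustration; not any leaderboard's official scorer.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    # eps guards against division by zero for silent or perfect signals.
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))


# Toy data (assumed): one second of a 440 Hz tone stands in for a true stem.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]

# Pretend estimates: the "vocals" estimate is 10% too quiet, the "bass" one 50%.
per_stem_sdr = {
    "vocals": sdr_db(tone, [0.9 * x for x in tone]),
    "bass": sdr_db(tone, [0.5 * x for x in tone]),
}

# Assumption: an "avg" score is taken here as a plain mean over stems.
average_sdr = sum(per_stem_sdr.values()) / len(per_stem_sdr)
print(f"vocals: {per_stem_sdr['vocals']:.2f} dB")  # 20.00 dB
print(f"bass:   {per_stem_sdr['bass']:.2f} dB")    # 6.02 dB
print(f"avg:    {average_sdr:.2f} dB")             # 13.01 dB
```

Higher values mean less residual error relative to the target stem, so a one-point gap between two table rows corresponds to roughly a 26% change in the signal-to-error energy ratio under this definition.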