Music Source Separation
Music source separation is the task of decomposing a music recording into its constituent components, e.g., separate stems for the vocals, bass, and drums.
Benchmark Results
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Sparse HT Demucs (fine tuned) | SDR (avg) | 9.2 | — | Unverified |
| 2 | Hybrid Transformer Demucs (f.t.) | SDR (avg) | 9 | — | Unverified |
| 3 | Band-Split RNN (semi-sup.) | SDR (avg) | 8.97 | — | Unverified |
| 4 | TFC-TDF-UNet (v3) | SDR (avg) | 8.34 | — | Unverified |
| 5 | Band-Split RNN | SDR (avg) | 8.23 | — | Unverified |
| 6 | Hybrid Demucs | SDR (avg) | 7.72 | — | Unverified |
| 7 | KUIELab-MDX-Net | SDR (avg) | 7.54 | — | Unverified |
| 8 | CDE-HTCN | SDR (avg) | 6.89 | — | Unverified |
| 9 | Attentive-MultiResUNet | SDR (avg) | 6.81 | — | Unverified |
| 10 | DEMUCS (extra) | SDR (avg) | 6.79 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | BS-RoFormer (L=12, OA) | SDR (avg) | 11.99 | — | Unverified |
| 2 | BS-RoFormer (L=6, OA) | SDR (avg) | 9.8 | — | Unverified |
| 3 | SCNet-large | SDR (avg) | 9.69 | — | Unverified |
| 4 | Sparse HT Demucs (fine tuned) | SDR (avg) | 9.2 | — | Unverified |
| 5 | SCNet | SDR (avg) | 9 | — | Unverified |
| 6 | Hybrid Transformer Demucs (f.t.) | SDR (avg) | 9 | — | Unverified |
| 7 | Band-Split RNN (semi-sup.) | SDR (avg) | 8.97 | — | Unverified |
| 8 | TFC-TDF-UNet (v3) | SDR (avg) | 8.34 | — | Unverified |
| 9 | Band-Split RNN | SDR (avg) | 8.24 | — | Unverified |
| 10 | Dual-Path TFC-TDF UNet (DTTNet) | SDR (avg) | 8.15 | — | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | DiCoSe (Deterministic) | SI-SDRi (Bass) | 20.04 | — | Unverified |
| 2 | LQ-VAE + Scalable Transformer | SDR (bass) | 7.42 | — | Unverified |
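
The tables above report SDR (signal-to-distortion ratio) and SI-SDRi (scale-invariant SDR improvement) scores in dB. As a rough illustration of what these numbers measure, here is a minimal sketch based on the basic energy-ratio definitions. It is an assumption that these definitions are close to what each entry used: leaderboard figures are usually produced with dedicated evaluation toolkits whose framing and aggregation can differ, so this sketch is not expected to reproduce the values listed. The function names (`sdr`, `si_sdr`, `si_sdr_improvement`, `average_sdr`) are illustrative and not taken from any library, and plain Python lists stand in for audio arrays.

```python
import math


def sdr(reference, estimate, eps=1e-12):
    """Plain SDR in dB: 10 * log10(||s||^2 / ||s - s_hat||^2)."""
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal = sum(r * r for r in reference)
    distortion = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    # eps keeps the ratio finite for silent or perfectly estimated stems.
    return 10.0 * math.log10((signal + eps) / (distortion + eps))


def si_sdr(reference, estimate, eps=1e-12):
    """Scale-invariant SDR in dB: rescale the reference before comparing."""
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    dot = sum(r * e for r, e in zip(reference, estimate))
    energy = sum(r * r for r in reference)
    alpha = dot / (energy + eps)  # optimal scaling of the reference
    target = [alpha * r for r in reference]
    signal = sum(t * t for t in target)
    distortion = sum((e - t) ** 2 for e, t in zip(estimate, target))
    return 10.0 * math.log10((signal + eps) / (distortion + eps))


def si_sdr_improvement(reference, estimate, mixture):
    """SI-SDRi: gain of the estimate over using the raw mixture as the estimate."""
    return si_sdr(reference, estimate) - si_sdr(reference, mixture)


def average_sdr(references, estimates):
    """Mean SDR over stems; both arguments are dicts keyed by stem name."""
    scores = [sdr(references[name], estimates[name]) for name in references]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy signals, not real audio: the estimate is the reference scaled by 0.9.
    ref = [1.0, -1.0, 1.0, -1.0]
    est = [0.9, -0.9, 0.9, -0.9]
    print(round(sdr(ref, est), 2))  # about 20 dB (energy ratio of 100)
    print(average_sdr({"vocals": ref, "bass": ref}, {"vocals": est, "bass": est}))
```

In this toy case the estimate differs from the reference only by a gain factor, so plain SDR penalizes it while SI-SDR does not. That difference is one reason scores reported under different metrics, as in the last table, are not directly comparable.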