Speech Separation
Speech separation is the task of extracting every overlapping speech source from a mixed speech signal. It is a special case of the source separation problem in which the focus is solely on the overlapping speech sources; other interference, such as music or noise, is not the main concern. A recent representative GitHub project is ClearerVoice-Studio.
Source: A Unified Framework for Speech Separation
Image credit: Speech Separation of A Target Speaker Based on Deep Neural Networks
Datasets: WSJ0-2mix, WHAMR!, Libri2Mix, WSJ0-3mix, LRS2, WHAM!, WSJ0-5mix, LRS3, VoxCeleb2, WSJ0-4mix, Libri5Mix, Libri10Mix
Benchmark Results
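Most of the tables below report SI-SDRi, the improvement in scale-invariant signal-to-distortion ratio of a separated estimate over the unprocessed mixture, in dB. The following is a minimal sketch of that calculation using the commonly used definition (zero-mean signals, projection of the estimate onto the reference). The function names `si_sdr` and `si_sdr_improvement`, the `eps` constant, and the toy sinusoid signals are illustrative assumptions, not taken from any listed paper, and individual papers may differ in details such as permutation handling. Plain Python lists are used only to keep the example self-contained; array libraries would normally be used in practice.

```python
import math


def si_sdr(estimate, reference, eps=1e-12):
    """Scale-invariant SDR in dB between two equal-length sample sequences."""
    n = len(reference)
    if len(estimate) != n:
        raise ValueError("estimate and reference must have the same length")

    # Remove the mean of each signal so a DC offset does not affect the score.
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [e - est_mean for e in estimate]
    ref = [r - ref_mean for r in reference]

    # Project the estimate onto the reference to get the scaled target part.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref)
    alpha = dot / (ref_energy + eps)
    target = [alpha * r for r in ref]

    # Everything in the estimate that is not the scaled target is distortion.
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(x * x for x in noise)
    return 10.0 * math.log10((target_energy + eps) / (noise_energy + eps))


def si_sdr_improvement(estimate, reference, mixture):
    """SI-SDRi: SI-SDR of the estimate minus SI-SDR of the raw mixture."""
    return si_sdr(estimate, reference) - si_sdr(mixture, reference)


# Toy example (hypothetical data): two orthogonal sinusoids stand in for speakers.
num_samples = 200
s1 = [math.sin(2 * math.pi * 10 * t / num_samples) for t in range(num_samples)]
s2 = [math.sin(2 * math.pi * 26 * t / num_samples) for t in range(num_samples)]
mixture = [a + b for a, b in zip(s1, s2)]

# An imperfect estimate of speaker 1 that still leaks 10% of speaker 2.
estimate = [a + 0.1 * b for a, b in zip(s1, s2)]

sdri = si_sdr_improvement(estimate, s1, mixture)
print(f"SI-SDRi for the toy estimate: {sdri:.2f} dB")
```

In this toy case the mixture scores about 0 dB against speaker 1 (equal-energy interference) and the estimate about 20 dB, so the improvement is about 20 dB. Because the metric is scale-invariant, multiplying the estimate by any nonzero constant leaves the score unchanged.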
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TF-Locoformer (L) + DM | SI-SDRi | 25.1 | — | Unverified |
| 2 | SepReformer-L | SI-SDRi | 25.1 | — | Unverified |
| 3 | TF-Locoformer (M) + DM | SI-SDRi | 24.6 | — | Unverified |
| 4 | TF-Locoformer (L) | SI-SDRi | 24.2 | — | Unverified |
| 5 | MossFormer2 (L) | SI-SDRi | 24.1 | — | Unverified |
| 6 | SepTDA (L=12) | SI-SDRi | 24 | — | Unverified |
| 7 | Separate And Diffuse | SI-SDRi | 23.9 | — | Unverified |
| 8 | TF-Locoformer (M) | SI-SDRi | 23.6 | — | Unverified |
| 9 | MossFormer (L) + DM | SI-SDRi | 22.8 | — | Unverified |
| 10 | TF-Locoformer (S) + DM | SI-SDRi | 22.8 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TF-Locoformer (M) | SI-SDRi | 18.5 | — | Unverified |
| 2 | TF-Locoformer (S) | SI-SDRi | 17.4 | — | Unverified |
| 3 | SepReformer-L + DM | SI-SDRi | 17.1 | — | Unverified |
| 4 | MossFormer2 | SI-SDRi | 17 | — | Unverified |
| 5 | MossFormer (L) + DM | SI-SDRi | 16.3 | — | Unverified |
| 6 | TD-Conformer (XL) + DM | SI-SDRi | 14.6 | — | Unverified |
| 7 | Improved Sudo rm -rf (U=36) | SI-SDRi | 13.5 | — | Unverified |
| 8 | TD-Conformer (L) + DM | SI-SDRi | 13.4 | — | Unverified |
| 9 | Wavesplit | SI-SDRi | 13.2 | — | Unverified |
| 10 | DPTNET - SRSSN | SI-SDRi | 12.3 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MossFormer2 (w speed perturb) | SI-SDRi | 22.2 | — | Unverified |
| 2 | TF-Locoformer (M) | SI-SDRi | 22.1 | — | Unverified |
| 3 | MossFormer2 (w/o DM) | SI-SDRi | 21.7 | — | Unverified |
| 4 | Separate And Diffuse | SI-SDRi | 21.5 | — | Unverified |
| 5 | WHYV | SI-SDRi | 17.5 | — | Unverified |
| 6 | TDANet Large | SI-SDRi | 17.4 | — | Unverified |
| 7 | TDANet | SI-SDRi | 16.9 | — | Unverified |
| 8 | Conv-Tasnet (Libri1Mix speech enhancement pre-trained) | SI-SDRi | 14.1 | — | Unverified |
| 9 | Conv-Tasnet (Libri1Mix speech enhancement multi-task) | SI-SDRi | 13.7 | — | Unverified |
| 10 | Conv-Tasnet | SI-SDRi | 13.2 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SepTDA | SI-SDRi | 23.7 | — | Unverified |
| 2 | MossFormer2 | SI-SDRi | 22.2 | — | Unverified |
| 3 | MossFormer (L) + DM | SI-SDRi | 21.2 | — | Unverified |
| 4 | Separate And Diffuse | SI-SDRi | 20.9 | — | Unverified |
| 5 | MossFormer (M) + DM | SI-SDRi | 20.8 | — | Unverified |
| 6 | SepIt | SI-SDRi | 20.1 | — | Unverified |
| 7 | SepFormer | SI-SDRi | 19.5 | — | Unverified |
| 8 | Sandglasset | SI-SDRi | 17.1 | — | Unverified |
| 9 | Gated DualPathRNN | SI-SDRi | 16.85 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | IIANet | SI-SNRi | 16.4 | — | Unverified |
| 2 | TDFNet-large | SI-SNRi | 15.8 | — | Unverified |
| 3 | TDFNet (MHSA + Shared) | SI-SNRi | 15 | — | Unverified |
| 4 | RTFS-Net-12 | SI-SNRi | 14.9 | — | Unverified |
| 5 | RTFS-Net-6 | SI-SNRi | 14.6 | — | Unverified |
| 6 | CTCNet | SI-SNRi | 14.3 | — | Unverified |
| 7 | RTFS-Net-4 | SI-SNRi | 14.1 | — | Unverified |
| 8 | TDFNet-small | SI-SNRi | 13.6 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SepReformer-L + DM | SI-SDRi | 18.4 | — | Unverified |
| 2 | MossFormer2 | SI-SDRi | 18.1 | — | Unverified |
| 3 | MossFormer (L) + DM | SI-SDRi | 17.3 | — | Unverified |
| 4 | TDANet Large | SI-SDRi | 15.2 | — | Unverified |
| 5 | TDANet | SI-SDRi | 14.8 | — | Unverified |
| 6 | WHYV | SI-SDRi | 12.96 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SepTDA | SI-SDRi | 21 | — | Unverified |
| 2 | Hungarian PIT | SI-SDRi | 13.22 | — | Unverified |
| 3 | Conditional TasNet | SI-SDRi | 11.7 | — | Unverified |
| 4 | TasTas | SI-SDRi | 11.14 | — | Unverified |
| 5 | Gated DualPathRNN | SI-SDRi | 10.56 | — | Unverified |
| 6 | Multi-Decoder DPRNN | SI-SDRi | 5.9 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | IIANet | SI-SNRi | 18.3 | — | Unverified |
| 2 | RTFS-Net-12 | SI-SNRi | 17.5 | — | Unverified |
| 3 | CTCNet | SI-SNRi | 17.4 | — | Unverified |
| 4 | RTFS-Net-6 | SI-SNRi | 16.9 | — | Unverified |
| 5 | RTFS-Net-4 | SI-SNRi | 15.5 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | IIANet | SI-SNRi | 14 | — | Unverified |
| 2 | RTFS-Net-12 | SI-SNRi | 12.4 | — | Unverified |
| 3 | CTCNet | SI-SNRi | 11.9 | — | Unverified |
| 4 | RTFS-Net-6 | SI-SNRi | 11.8 | — | Unverified |
| 5 | RTFS-Net-4 | SI-SNRi | 11.5 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SepTDA | SI-SDRi | 22 | — | Unverified |
| 2 | Gated DualPathRNN | SI-SDRi | 12.88 | — | Unverified |
| 3 | Conditional TasNet | SI-SDRi | 12.5 | — | Unverified |
| 4 | OR-PIT | SI-SDRi | 10.2 | — | Unverified |
| 5 | Multi-Decoder DPRNN | SI-SDRi | 9.3 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Separate And Diffuse | SI-SDRi | 14.2 | — | Unverified |
| 2 | SepIt | SI-SDRi | 13.7 | — | Unverified |
| 3 | OCD | SI-SDRi | 13.4 | — | Unverified |
| 4 | Hungarian PIT | SI-SDRi | 12.72 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Separate And Diffuse | SI-SDRi | 9 | — | Unverified |
| 2 | SepIt | SI-SDRi | 8.2 | — | Unverified |
| 3 | Hungarian PIT | SI-SDRi | 7.78 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | | SDR | 9.6 | — | Unverified |
| 2 | Audio-Visual concat-ref | SDR | 8.05 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Separate And Diffuse | SI-SDRi | 5.2 | — | Unverified |
| 2 | Hungarian PIT | SI-SDRi | 4.26 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Conformer (base) | 0S | 5.6 | — | Unverified |
| 2 | Conformer (large) | 0S | 5.4 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Hungarian PIT | SI-SDRi | 5.66 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Audio-Visual concat-ref | SDR | 10.55 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | MossFormer2 | SI-SDRi | 20.5 | — | Unverified |