Speech Separation
Speech separation is the task of extracting all overlapping speech sources from a given mixed speech signal. It is a special case of the source separation problem in which the focus is only on overlapping speech sources; other interference, such as music or noise, is not the main concern. A recent representative GitHub project is ClearerVoice-Studio.
Source: A Unified Framework for Speech Separation
Image credit: Speech Separation of A Target Speaker Based on Deep Neural Networks
Papers
Datasets: WSJ0-2mix, WHAMR!, Libri2Mix, WSJ0-3mix, LRS2, WHAM!, WSJ0-5mix, LRS3, VoxCeleb2, WSJ0-4mix, Libri5Mix, Libri10Mix
Benchmark Results
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SepTDA | SI-SDRi | 21 | — | Unverified |
| 2 | Hungarian PIT | SI-SDRi | 13.22 | — | Unverified |
| 3 | Conditional TasNet | SI-SDRi | 11.7 | — | Unverified |
| 4 | TasTas | SI-SDRi | 11.14 | — | Unverified |
| 5 | Gated DualPathRNN | SI-SDRi | 10.56 | — | Unverified |
| 6 | Multi-Decoder DPRNN | SI-SDRi | 5.9 | — | Unverified |
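
The table reports SI-SDRi without defining it. The sketch below assumes the usual reading: scale-invariant signal-to-distortion ratio improvement, in dB, computed as the SI-SDR of the separated estimate minus the SI-SDR of the unprocessed mixture, both measured against the clean reference. The function names and the toy sinusoid signals are illustrative assumptions and are not taken from any of the listed models.

```python
import math


def si_sdr(estimate, reference, eps=1e-12):
    """Scale-invariant SDR in dB between two equal-length sample sequences."""
    if len(estimate) != len(reference):
        raise ValueError("estimate and reference must have the same length")
    # Remove the mean so a constant offset does not affect the score.
    est_mean = sum(estimate) / len(estimate)
    ref_mean = sum(reference) / len(reference)
    est = [e - est_mean for e in estimate]
    ref = [r - ref_mean for r in reference]
    # Project the estimate onto the reference to get the scaled target.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref)
    alpha = dot / (ref_energy + eps)
    target = [alpha * r for r in ref]
    # Everything in the estimate that is not the scaled target is distortion.
    distortion = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    distortion_energy = sum(d * d for d in distortion)
    return 10.0 * math.log10((target_energy + eps) / (distortion_energy + eps))


def si_sdr_improvement(estimate, mixture, reference):
    """SI-SDRi: gain of the estimate over the unprocessed mixture, in dB."""
    return si_sdr(estimate, reference) - si_sdr(mixture, reference)


# Toy example (hypothetical signals): two orthogonal sinusoids stand in for speakers.
n = 1000
speaker_a = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
speaker_b = [math.sin(2 * math.pi * 13 * t / n) for t in range(n)]
mixture = [a + b for a, b in zip(speaker_a, speaker_b)]
# A pretend separator output: speaker A with 10% of speaker B leaking through.
estimate_a = [a + 0.1 * b for a, b in zip(speaker_a, speaker_b)]

print(round(si_sdr_improvement(estimate_a, mixture, speaker_a), 2))
```

In this toy case the mixture scores about 0 dB against speaker A, because the interferer has equal energy, and the leaky estimate scores about 20 dB, so the improvement is about 20 dB. Rescaling the estimate leaves the score unchanged, which is the point of the scale-invariant variant.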