SOTAVerified

Speech Separation

Speech separation is the task of extracting all overlapping speech sources from a given mixed speech signal. It is a special case of the source separation problem in which the focus is only on the overlapping speech sources; other interference, such as music or noise, is not the main concern. A recent representative GitHub project is ClearerVoice-Studio.

Source: A Unified Framework for Speech Separation

Image credit: Speech Separation of A Target Speaker Based on Deep Neural Networks

Papers

Showing 1-10 of 359 papers

Title | Status | Hype
Dynamic Slimmable Networks for Efficient Speech Separation | | 0
Improving Practical Aspects of End-to-End Multi-Talker Speech Recognition for Online and Offline Scenarios | | 0
SoloSpeech: Enhancing Intelligibility and Quality in Target Speech Extraction through a Cascaded Generative Pipeline | Code | 3
Attractor-Based Speech Separation of Multiple Utterances by Unknown Number of Speakers | | 0
Single-Channel Target Speech Extraction Utilizing Distance and Room Clues | | 0
Time-Frequency-Based Attention Cache Memory Model for Real-Time Speech Separation | | 0
SepPrune: Structured Pruning for Efficient Deep Speech Separation | Code | 1
A Survey of Deep Learning for Complex Speech Spectrograms | | 0
ArrayDPS: Unsupervised Blind Speech Separation with a Diffusion Prior | Code | 1
SwinLip: An Efficient Visual Speech Encoder for Lip Reading Using Swin Transformer | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | IIANet | SI-SNRi | 16.4 | | Unverified
2 | TDFNet-large | SI-SNRi | 15.8 | | Unverified
3 | TDFNet (MHSA + Shared) | SI-SNRi | 15 | | Unverified
4 | RTFS-Net-12 | SI-SNRi | 14.9 | | Unverified
5 | RTFS-Net-6 | SI-SNRi | 14.6 | | Unverified
6 | CTCNet | SI-SNRi | 14.3 | | Unverified
7 | RTFS-Net-4 | SI-SNRi | 14.1 | | Unverified
8 | TDFNet-small | SI-SNRi | 13.6 | | Unverified
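
The metric reported above, SI-SNRi, is the improvement in scale-invariant signal-to-noise ratio (in dB) of the separated output over the unprocessed mixture. The following is a minimal sketch of the commonly used definition, assuming single-channel, time-aligned signals. The function names `si_snr` and `si_snr_improvement` and the synthetic noise signals are illustrative only, and the listed papers may differ in details such as averaging over speakers or permutation handling.

```python
import numpy as np


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between a 1-D estimate and its reference."""
    # Remove the mean so that a constant offset does not affect the score.
    estimate = estimate - estimate.mean()
    target = target - target.mean()
    # Project the estimate onto the target: the part that is "signal".
    scale = np.dot(estimate, target) / (np.dot(target, target) + eps)
    projection = scale * target
    # Whatever is left after the projection counts as noise.
    noise = estimate - projection
    ratio = (np.sum(projection ** 2) + eps) / (np.sum(noise ** 2) + eps)
    return 10.0 * np.log10(ratio)


def si_snr_improvement(estimate, target, mixture):
    """SI-SNRi: SI-SNR of the estimate minus SI-SNR of the raw mixture."""
    return si_snr(estimate, target) - si_snr(mixture, target)


# Synthetic stand-ins for speech (assumption: white noise, 1 s at 16 kHz).
rng = np.random.default_rng(0)
target = rng.standard_normal(16000)
interferer = rng.standard_normal(16000)
mixture = target + interferer            # equal-energy two-source mixture
estimate = target + 0.1 * interferer     # hypothetical separator output

improvement = si_snr_improvement(estimate, target, mixture)
print(f"SI-SNRi: {improvement:.2f} dB")
```

In this toy setup the interferer is attenuated by a factor of 10 in amplitude, so the improvement comes out at roughly 20 dB. Because the estimate is projected onto the reference first, rescaling the estimate leaves the score unchanged, which is the point of the "scale-invariant" qualifier.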