SOTAVerified

Speech Separation

Speech separation is the task of extracting all overlapping speech sources from a given mixed speech signal. It is a special case of the source separation problem in which the focus is solely on the overlapping speech sources; other interference, such as music or noise, is not the main concern. A recent representative GitHub project is ClearerVoice-Studio.
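The definition above can be written as a common signal model. This is a hedged sketch rather than a formulation taken from the source: the symbols $x$, $s_c$, $n$, $C$, and $f_\theta$ are introduced here for illustration only. A single-channel mixture of $C$ speakers is modeled as the sum of the individual speech sources plus a residual interference term, and a separator maps the mixture to one estimate per speaker.

```latex
x(t) = \sum_{c=1}^{C} s_c(t) + n(t), \qquad
\bigl\{\hat{s}_1(t), \dots, \hat{s}_C(t)\bigr\} = f_\theta\bigl(x(t)\bigr)
```

Here $n(t)$ collects music, noise, and other non-speech interference, which the definition treats as secondary.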

Source: A Unified Framework for Speech Separation

Image credit: Speech Separation of A Target Speaker Based on Deep Neural Networks

Papers

Showing 201–250 of 359 papers

Title | Status | Hype
Deep AHS: A Deep Learning Approach to Acoustic Howling Suppression | - | 0
Deep Attention Fusion Feature for Speech Separation with End-to-End Post-filter Method | - | 0
Deep Clustering and Conventional Networks for Music Separation: Stronger Together | - | 0
Deep Learning for Joint Acoustic Echo and Acoustic Howling Suppression in Hybrid Meetings | - | 0
Deep Neural Mel-Subband Beamformer for In-car Speech Separation | - | 0
Deep neural network Based Low-latency Speech Separation with Asymmetric analysis-Synthesis Window Pair | - | 0
Deep neural network techniques for monaural speech enhancement: state of the art analysis | - | 0
Deep Variational Generative Models for Audio-visual Speech Separation | - | 0
Demystifying TasNet: A Dissecting Approach | - | 0
Diffusion-based Signal Refiner for Speech Separation | - | 0
Directed Speech Separation for Automatic Speech Recognition of Long Form Conversational Speech | - | 0
Discretization and Re-synthesis: an alternative method to solve the Cocktail Party Problem | - | 0
Discriminative Learning for Monaural Speech Separation Using Deep Embedding Features | - | 0
DNN driven Speaker Independent Audio-Visual Mask Estimation for Speech Separation | - | 0
Dual-Path Cross-Modal Attention for better Audio-Visual Speech Extraction | - | 0
Dual-Path Filter Network: Speaker-Aware Modeling for Speech Separation | - | 0
Dual-Path Modeling for Long Recording Speech Separation in Meetings | - | 0
DualSep: A Light-weight dual-encoder convolutional recurrent network for real-time in-car speech separation | - | 0
Dynamic Slimmable Networks for Efficient Speech Separation | - | 0
EDSep: An Effective Diffusion-Based Method for Speech Source Separation | - | 0
EEG-informed attended speaker extraction from recorded speech mixtures with application in neuro-steered hearing prostheses | - | 0
Effects of Dataset Sampling Rate for Noise Cancellation through Deep Learning | - | 0
Efficient Integration of Multi-channel Information for Speaker-independent Speech Separation | - | 0
Efficient Transformer-based Speech Enhancement Using Long Frames and STFT Magnitudes | - | 0
Embedding Recurrent Layers with Dual-Path Strategy in a Variant of Convolutional Network for Speaker-Independent Speech Separation | - | 0
Endpoint Detection for Streaming End-to-End Multi-talker ASR | - | 0
Spatially Selective Deep Non-linear Filters for Speaker Extraction | - | 0
Speakerfilter-Pro: an improved target speaker extractor combines the time domain and frequency domain | - | 0
Speaker-independent Speech Separation with Deep Attractor Network | - | 0
Speech enhancement aided end-to-end multi-task learning for voice activity detection | - | 0
Speech Separation based on Contrastive Learning and Deep Modularization | - | 0
Speech Separation using Neural Audio Codecs with Embedding Loss | - | 0
Speech separation with large-scale self-supervised learning | - | 0
STCON System for the CHiME-8 Challenge | - | 0
Stepwise-Refining Speech Separation Network via Fine-Grained Encoding in High-order Latent Domain | - | 0
Streaming Target-Speaker ASR with Neural Transducer | - | 0
Streaming Multi-talker Speech Recognition with Joint Speaker Identification | - | 0
Study of the Performance of CEEMDAN in Underdetermined Speech Separation | - | 0
Supervised Speech Separation Based on Deep Learning: An Overview | - | 0
Surrogate Source Model Learning for Determined Source Separation | - | 0
SwinLip: An Efficient Visual Speech Encoder for Lip Reading Using Swin Transformer | - | 0
TalTech-IRIT-LIS Speaker and Language Diarization Systems for DISPLACE 2024 | - | 0
Target Confusion in End-to-end Speaker Extraction: Analysis and Approaches | - | 0
Task-Aware Unified Source Separation | - | 0
Teacher-Student MixIT for Unsupervised and Semi-supervised Speech Separation | - | 0
Temporal-Spatial Neural Filter: Direction Informed End-to-End Multi-channel Target Speech Separation | - | 0
Tensor-Train Long Short-Term Memory for Monaural Speech Enhancement | - | 0
The fifth 'CHiME' Speech Separation and Recognition Challenge: Dataset, task and baselines | - | 0
The RoyalFlush System of Speech Recognition for M2MeT Challenge | - | 0
TIGER: Time-frequency Interleaved Gain Extraction and Reconstruction for Efficient Speech Separation | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | TF-Locoformer (L) + DM | SI-SDRi | 25.1 | - | Unverified
2 | SepReformer-L | SI-SDRi | 25.1 | - | Unverified
3 | TF-Locoformer (M) + DM | SI-SDRi | 24.6 | - | Unverified
4 | TF-Locoformer (L) | SI-SDRi | 24.2 | - | Unverified
5 | MossFormer2 (L) | SI-SDRi | 24.1 | - | Unverified
6 | SepTDA (L=12) | SI-SDRi | 24 | - | Unverified
7 | Separate And Diffuse | SI-SDRi | 23.9 | - | Unverified
8 | TF-Locoformer (M) | SI-SDRi | 23.6 | - | Unverified
9 | MossFormer (L) + DM | SI-SDRi | 22.8 | - | Unverified
10 | TF-Locoformer (S) + DM | SI-SDRi | 22.8 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | TF-Locoformer (M) | SI-SDRi | 18.5 | - | Unverified
2 | TF-Locoformer (S) | SI-SDRi | 17.4 | - | Unverified
3 | SepReformer-L + DM | SI-SDRi | 17.1 | - | Unverified
4 | MossFormer2 | SI-SDRi | 17 | - | Unverified
5 | MossFormer (L) + DM | SI-SDRi | 16.3 | - | Unverified
6 | TD-Conformer (XL) + DM | SI-SDRi | 14.6 | - | Unverified
7 | Improved Sudo rm -rf (U=36) | SI-SDRi | 13.5 | - | Unverified
8 | TD-Conformer (L) + DM | SI-SDRi | 13.4 | - | Unverified
9 | Wavesplit | SI-SDRi | 13.2 | - | Unverified
10 | DPTNET - SRSSN | SI-SDRi | 12.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MossFormer2 (w speed perturb) | SI-SDRi | 22.2 | - | Unverified
2 | TF-Locoformer (M) | SI-SDRi | 22.1 | - | Unverified
3 | MossFormer2 (w/o DM) | SI-SDRi | 21.7 | - | Unverified
4 | Separate And Diffuse | SI-SDRi | 21.5 | - | Unverified
5 | WHYV | SI-SDRi | 17.5 | - | Unverified
6 | TDANet Large | SI-SDRi | 17.4 | - | Unverified
7 | TDANet | SI-SDRi | 16.9 | - | Unverified
8 | Conv-Tasnet (Libri1Mix speech enhancement pre-trained) | SI-SDRi | 14.1 | - | Unverified
9 | Conv-Tasnet (Libri1Mix speech enhancement multi-task) | SI-SDRi | 13.7 | - | Unverified
10 | Conv-Tasnet | SI-SDRi | 13.2 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SepTDA | SI-SDRi | 23.7 | - | Unverified
2 | MossFormer2 | SI-SDRi | 22.2 | - | Unverified
3 | MossFormer (L) + DM | SI-SDRi | 21.2 | - | Unverified
4 | Separate And Diffuse | SI-SDRi | 20.9 | - | Unverified
5 | MossFormer (M) + DM | SI-SDRi | 20.8 | - | Unverified
6 | SepIt | SI-SDRi | 20.1 | - | Unverified
7 | SepFormer | SI-SDRi | 19.5 | - | Unverified
8 | Sandglasset | SI-SDRi | 17.1 | - | Unverified
9 | Gated DualPathRNN | SI-SDRi | 16.85 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | IIANet | SI-SNRi | 16.4 | - | Unverified
2 | TDFNet-large | SI-SNRi | 15.8 | - | Unverified
3 | TDFNet (MHSA + Shared) | SI-SNRi | 15 | - | Unverified
4 | RTFS-Net-12 | SI-SNRi | 14.9 | - | Unverified
5 | RTFS-Net-6 | SI-SNRi | 14.6 | - | Unverified
6 | CTCNet | SI-SNRi | 14.3 | - | Unverified
7 | RTFS-Net-4 | SI-SNRi | 14.1 | - | Unverified
8 | TDFNet-small | SI-SNRi | 13.6 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SepReformer-L + DM | SI-SDRi | 18.4 | - | Unverified
2 | MossFormer2 | SI-SDRi | 18.1 | - | Unverified
3 | MossFormer (L) + DM | SI-SDRi | 17.3 | - | Unverified
4 | TDANet Large | SI-SDRi | 15.2 | - | Unverified
5 | TDANet | SI-SDRi | 14.8 | - | Unverified
6 | WHYV | SI-SDRi | 12.96 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SepTDA | SI-SDRi | 21 | - | Unverified
2 | Hungarian PIT | SI-SDRi | 13.22 | - | Unverified
3 | Conditional TasNet | SI-SDRi | 11.7 | - | Unverified
4 | TasTas | SI-SDRi | 11.14 | - | Unverified
5 | Gated DualPathRNN | SI-SDRi | 10.56 | - | Unverified
6 | Multi-Decoder DPRNN | SI-SDRi | 5.9 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | IIANet | SI-SNRi | 18.3 | - | Unverified
2 | RTFS-Net-12 | SI-SNRi | 17.5 | - | Unverified
3 | CTCNet | SI-SNRi | 17.4 | - | Unverified
4 | RTFS-Net-6 | SI-SNRi | 16.9 | - | Unverified
5 | RTFS-Net-4 | SI-SNRi | 15.5 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | IIANet | SI-SNRi | 14 | - | Unverified
2 | RTFS-Net-12 | SI-SNRi | 12.4 | - | Unverified
3 | CTCNet | SI-SNRi | 11.9 | - | Unverified
4 | RTFS-Net-6 | SI-SNRi | 11.8 | - | Unverified
5 | RTFS-Net-4 | SI-SNRi | 11.5 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SepTDA | SI-SDRi | 22 | - | Unverified
2 | Gated DualPathRNN | SI-SDRi | 12.88 | - | Unverified
3 | Conditional TasNet | SI-SDRi | 12.5 | - | Unverified
4 | OR-PIT | SI-SDRi | 10.2 | - | Unverified
5 | Multi-Decoder DPRNN | SI-SDRi | 9.3 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Separate And Diffuse | SI-SDRi | 14.2 | - | Unverified
2 | SepIt | SI-SDRi | 13.7 | - | Unverified
3 | OCD | SI-SDRi | 13.4 | - | Unverified
4 | Hungarian PIT | SI-SDRi | 12.72 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Separate And Diffuse | SI-SDRi | 9 | - | Unverified
2 | SepIt | SI-SDRi | 8.2 | - | Unverified
3 | Hungarian PIT | SI-SDRi | 7.78 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | - | SDR | 9.6 | - | Unverified
2 | Audio-Visual concat-ref | SDR | 8.05 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Separate And Diffuse | SI-SDRi | 5.2 | - | Unverified
2 | Hungarian PIT | SI-SDRi | 4.26 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Conformer (base) | 0S | 5.6 | - | Unverified
2 | Conformer (large) | 0S | 5.4 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Hungarian PIT | SI-SDRi | 5.66 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Audio-Visual concat-ref | SDR | 10.55 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MossFormer2 | SI-SDRi | 20.5 | - | Unverified
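
Most of the results above are reported as SI-SDRi, the improvement in scale-invariant signal-to-distortion ratio of the separated output over the unprocessed mixture, in dB. The following is a minimal NumPy sketch of how that quantity is commonly computed. It is not the evaluation code behind any listed result; the function names `si_sdr` and `si_sdr_improvement`, the `eps` constant, and the toy sine-wave "speakers" are assumptions made for illustration. SI-SNRi is usually computed in the same way.

```python
import numpy as np


def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant SDR in dB between two 1-D signals (illustrative helper)."""
    estimate = np.asarray(estimate, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    # Remove the mean so a DC offset does not influence the score.
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Project the estimate onto the reference to obtain the scaled target.
    scale = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = scale * reference
    # Everything that is not explained by the target counts as distortion.
    distortion = estimate - target
    ratio = (np.dot(target, target) + eps) / (np.dot(distortion, distortion) + eps)
    return 10.0 * np.log10(ratio)


def si_sdr_improvement(estimate, reference, mixture):
    """SI-SDRi: SI-SDR of the estimate minus SI-SDR of the raw mixture."""
    return si_sdr(estimate, reference) - si_sdr(mixture, reference)


# Toy example: two sine waves stand in for two speakers (assumption).
t = np.arange(16000) / 16000.0
s1 = np.sin(2 * np.pi * 220 * t)
s2 = np.sin(2 * np.pi * 330 * t)
mixture = s1 + s2
# Hypothetical separator output: speaker 1 with speaker 2 attenuated by 20 dB.
estimate = s1 + 0.1 * s2

gain = si_sdr_improvement(estimate, s1, mixture)
print(f"SI-SDRi: {gain:.2f} dB")  # roughly 20 dB for this toy case
```

Because the estimate is projected onto the reference before the ratio is taken, rescaling the estimate does not change the score. This is why the metric is called scale-invariant.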