SOTAVerified

Speech Enhancement

Speech enhancement is a signal processing task that improves the quality of speech signals captured under noisy or degraded conditions. The goal is to make speech clearer, more intelligible, and more pleasant to listen to, which benefits applications such as voice recognition, teleconferencing, and hearing aids. A representative GitHub project with an online demo: ClearerVoice-Studio.

(Image credit: A Fully Convolutional Neural Network For Speech Enhancement)

Papers

Showing 301–350 of 982 papers

| Title | Status | Hype |
| --- | --- | --- |
| Ultra-Low Latency Speech Enhancement - A Comprehensive Study | | 0 |
| Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement | | 0 |
| Investigating Training Objectives for Generative Speech Enhancement | Code | 0 |
| TCG CREST System Description for the Second DISPLACE Challenge | | 0 |
| Rethinking Mamba in Speech Processing by Self-Supervised Models | | 0 |
| DeWinder: Single-Channel Wind Noise Reduction using Ultrasound Sensing | | 0 |
| TF-Mamba: A Time-Frequency Network for Sound Source Localization | | 0 |
| Diffusion-based Speech Enhancement with Schrödinger Bridge and Symmetric Noise Schedule | | 0 |
| aTENNuate: Optimized Real-time Speech Enhancement with Deep SSMs on Raw Audio | | 0 |
| Effective Noise-aware Data Simulation for Domain-adaptive Speech Enhancement Leveraging Dynamic Stochastic Perturbation | Code | 0 |
| Progressive Residual Extraction based Pre-training for Speech Representation Learning | | 0 |
| Spectral Masking with Explicit Time-Context Windowing for Neural Network-Based Monaural Speech Enhancement | | 0 |
| Dynamic Gated Recurrent Neural Network for Compute-efficient Speech Enhancement | | 0 |
| DPSNN: Spiking Neural Network for Low-Latency Streaming Speech Enhancement | | 0 |
| Heterogeneous Space Fusion and Dual-Dimension Attention: A New Paradigm for Speech Enhancement | | 0 |
| BSS-CFFMA: Cross-Domain Feature Fusion and Multi-Attention Speech Enhancement Network based on Self-Supervised Embedding | Code | 0 |
| Direction of Arrival Correction through Speech Quality Feedback | Code | 0 |
| One-Shot Distributed Node-Specific Signal Estimation with Non-Overlapping Latent Subspaces in Acoustic Sensor Networks | | 0 |
| ctPuLSE: Close-Talk, and Pseudo-Label Based Far-Field, Speech Enhancement | | 0 |
| Speech Bandwidth Expansion Via High Fidelity Generative Adversarial Networks | | 0 |
| Schrödinger Bridge for Generative Speech Enhancement | | 0 |
| Wideband Relative Transfer Function (RTF) Estimation Exploiting Frequency Correlations | Code | 0 |
| RT-LA-VocE: Real-Time Low-SNR Audio-Visual Speech Enhancement | | 0 |
| Unsupervised Face-Masked Speech Enhancement Using Generative Adversarial Networks With Human-in-the-Loop Assessment Metrics | | 0 |
| Open-Source Conversational AI with SpeechBrain 1.0 | | 0 |
| DASB -- Discrete Audio and Speech Benchmark | | 0 |
| Diffusion-based Generative Modeling with Discriminative Guidance for Streamable Speech Enhancement | | 0 |
| Spatially constrained vs. unconstrained filtering in neural spatiospectral filters for multichannel speech enhancement | | 0 |
| An Exploration of Length Generalization in Transformer-Based Speech Enhancement | | 0 |
| Personalized Speech Enhancement Without a Separate Speaker Embedding Model | | 0 |
| FlowAVSE: Efficient Audio-Visual Speech Enhancement with Conditional Flow Matching | | 0 |
| Comparative Analysis of Personalized Voice Activity Detection Systems: Assessing Real-World Effectiveness | | 0 |
| Pre-training Feature Guided Diffusion Model for Speech Enhancement | | 0 |
| The Effect of Training Dataset Size on Discriminative and Diffusion-Based Speech Enhancement Systems | | 0 |
| Thunder : Unified Regression-Diffusion Speech Enhancement with a Single Reverse Step using Brownian Bridge | | 0 |
| An Investigation of Noise Robustness for Flow-Matching-Based Zero-Shot TTS | | 0 |
| URGENT Challenge: Universality, Robustness, and Generalizability For Speech Enhancement | | 0 |
| Helsinki Speech Challenge 2024 | | 0 |
| Flexible Multichannel Speech Enhancement for Noise-Robust Frontend | | 0 |
| PLDNet: PLD-Guided Lightweight Deep Network Boosted by Efficient Attention for Handheld Dual-Microphone Speech Enhancement | | 0 |
| Reference Channel Selection by Multi-Channel Masking for End-to-End Multi-Channel Speech Enhancement | | 0 |
| The PESQetarian: On the Relevance of Goodhart's Law for Speech Enhancement | | 0 |
| Speech enhancement deep-learning architecture for efficient edge processing | | 0 |
| Non-autoregressive real-time Accent Conversion model with voice cloning | | 0 |
| Monaural speech enhancement on drone via Adapter based transfer learning | | 0 |
| Building a Luganda Text-to-Speech Model From Crowdsourced Data | | 0 |
| Evaluating Speech Enhancement Systems Through Listening Effort | | 0 |
| Real-time multichannel deep speech enhancement in hearing aids: Comparing monaural and binaural processing in complex acoustic scenarios | | 0 |
| TRAMBA: A Hybrid Transformer and Mamba Architecture for Practical Audio and Bone Conduction Speech Super Resolution and Enhancement on Mobile and Wearable Platforms | | 0 |
| Deep low-latency joint speech transmission and enhancement over a gaussian channel | | 0 |

Benchmark Results

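| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ROSE-CD(PESQ) | PESQ (wb) | 3.99 | | Unverified |
| 2 | PESQetarian | PESQ (wb) | 3.82 | | Unverified |
| 3 | Mamba-SEUNet L (+PCS) | PESQ (wb) | 3.73 | | Unverified |
| 4 | Schrödinger bridge (PESQ loss) | PESQ (wb) | 3.7 | | Unverified |
| 5 | SEMamba (+PCS) | PESQ (wb) | 3.69 | | Unverified |
| 6 | ZipEnhancer (S, \lambda_6 = 0) | PESQ (wb) | 3.63 | | Unverified |
| 7 | PrimeK-Net | PESQ (wb) | 3.61 | | Unverified |
| 8 | ZipEnhancer (S, \lambda_6 = 0.2) | PESQ (wb) | 3.61 | | Unverified |
| 9 | MP-SENet | PESQ (wb) | 3.6 | | Unverified |
| 10 | PCS_CS_WAVLM | PESQ (wb) | 3.54 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | BSRNN-S + MGD | SI-SDR-WB | 21.4 | | Unverified |
| 2 | DTLN | SI-SDR-WB | 16.34 | | Unverified |
| 3 | Non-Real-Time MultiScale+ | SI-SDR-WB | 16.22 | | Unverified |
| 4 | ZipEnhancer (M) | PESQ-WB | 3.81 | | Unverified |
| 5 | TF-Locoformer (M) | PESQ-WB | 3.72 | | Unverified |
| 6 | ZipEnhancer (S) | PESQ-WB | 3.69 | | Unverified |
| 7 | MambAttention | PESQ-WB | 3.67 | | Unverified |
| 8 | MP-SENet | PESQ-WB | 3.62 | | Unverified |
| 9 | xLSTM-SENet | PESQ-WB | 3.59 | | Unverified |
| 10 | BSRNN-S + MRSD | PESQ-WB | 3.53 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Inter-Channel Conv-TasNet | SDR | 19.67 | | Unverified |
| 2 | CA Dense U-Net (Complex) | SDR | 18.64 | | Unverified |
| 3 | Dense U-Net (Complex) | SDR | 18.4 | | Unverified |
| 4 | Dense U-Net (Real) | SDR | 16.86 | | Unverified |
| 5 | U-Net (Real) | SDR | 15.97 | | Unverified |
| 6 | Noisy/unprocessed | SDR | 6.5 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Schrödinger Bridge (PESQ loss) | PESQ-WB | 3.09 | | Unverified |
| 2 | SGMSE+ | PESQ-WB | 2.5 | | Unverified |
| 3 | Demucs v4 | PESQ-WB | 2.37 | | Unverified |
| 4 | Schrödinger Bridge | PESQ-WB | 2.33 | | Unverified |
| 5 | Conv-TasNet | PESQ-WB | 2.31 | | Unverified |
| 6 | CDiffuSE | PESQ-WB | 1.6 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ReVISE (ch2) | Audio Quality MOS | 4.19 | | Unverified |
| 2 | ReVISE (bf) | Audio Quality MOS | 4.11 | | Unverified |
| 3 | Demucs (ch2) | Audio Quality MOS | 2.95 | | Unverified |
| 4 | Demucs (bf) | Audio Quality MOS | 2.39 | | Unverified |
| 5 | MaxDI (Baseline) | PESQ | 1.17 | | Unverified |
| 6 | DAJA (MVDR,HMA,1000) (Overlapped Speech) | SDR | -4.76 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ZipEnhancer (M) | PESQ-NB | 4.08 | | Unverified |
| 2 | DCCRN-MC | PESQ-NB | 3.21 | | Unverified |
| 3 | DCCRN-M | PESQ-NB | 3.15 | | Unverified |
| 4 | DCCRN | PESQ-NB | 3.04 | | Unverified |
| 5 | RNN-Modulation | PESQ-WB | 2.75 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MambAttention | ESTOI | 0.8 | | Unverified |
| 2 | SEMamba | ESTOI | 0.8 | | Unverified |
| 3 | xLSTM-SENet | ESTOI | 0.8 | | Unverified |
| 4 | MP-SENet | ESTOI | 0.79 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | SepFormer | PESQ | 2.84 | | Unverified |
| 2 | DTLN | PESQ | 2.23 | | Unverified |
| 3 | Unprocessed | PESQ | 1.83 | | Unverified |
| 4 | Non-Real-Time MultiScale+ | PESQ | 1.52 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | DCUNet-MC | PESQ-NB | 3.44 | | Unverified |
| 2 | DCCRN-M | PESQ-NB | 3.28 | | Unverified |
| 3 | DCUNet | PESQ-NB | 3.25 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | CleanMel-L-map | DNSMOS | 3.82 | | Unverified |
| 2 | SpatialNet | DNSMOS BAK | 3.43 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | rose_cd(PESQ ) | PESQ | 3.99 | | Unverified |
| 2 | ROSE-CD | PESQ | 3.49 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Wave-U-Net | CBAK | 3.24 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Audio-Visual concat-ref | PESQ | 2.7 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | SE-MelGAN | Audio Quality MOS | 3.1 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | DeFT-AN | PESQ | 3.01 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Audio-Visual concat-ref | PESQ | 3.03 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | SepFormer | PESQ | 3.07 | | Unverified |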