SOTAVerified

Speech Enhancement

Speech enhancement is a signal processing task that improves the quality of speech signals captured under noisy or degraded conditions. The goal is to make speech clearer, more intelligible, and more pleasant to listen to, which benefits applications such as voice recognition, teleconferencing, and hearing aids. A representative GitHub project with an online demo: ClearerVoice-Studio.

(Image credit: A Fully Convolutional Neural Network For Speech Enhancement)

Papers

Showing 151–200 of 982 papers

| Title | Status | Hype |
| --- | --- | --- |
| Schrödinger Bridge for Generative Speech Enhancement |  | 0 |
| Wideband Relative Transfer Function (RTF) Estimation Exploiting Frequency Correlations | Code | 0 |
| Vibravox: A Dataset of French Speech Captured with Body-conduction Audio Sensors | Code | 1 |
| RT-LA-VocE: Real-Time Low-SNR Audio-Visual Speech Enhancement |  | 0 |
| Unsupervised Face-Masked Speech Enhancement Using Generative Adversarial Networks With Human-in-the-Loop Assessment Metrics |  | 0 |
| Open-Source Conversational AI with SpeechBrain 1.0 |  | 0 |
| Exploiting Foundation Models and Speech Enhancement for Parkinson's Disease Detection from Speech in Real-World Operative Conditions | Code | 1 |
| DASB -- Discrete Audio and Speech Benchmark |  | 0 |
| Diffusion-based Generative Modeling with Discriminative Guidance for Streamable Speech Enhancement |  | 0 |
| Universal Score-based Speech Enhancement with High Content Preservation | Code | 2 |
| Spatially constrained vs. unconstrained filtering in neural spatiospectral filters for multichannel speech enhancement |  | 0 |
| An Exploration of Length Generalization in Transformer-Based Speech Enhancement |  | 0 |
| AV-CrossNet: an Audiovisual Complex Spectral Mapping Network for Speech Separation By Leveraging Narrow- and Cross-Band Modeling | Code | 1 |
| Personalized Speech Enhancement Without a Separate Speaker Embedding Model |  | 0 |
| FlowAVSE: Efficient Audio-Visual Speech Enhancement with Conditional Flow Matching |  | 0 |
| Comparative Analysis of Personalized Voice Activity Detection Systems: Assessing Real-World Effectiveness |  | 0 |
| Pre-training Feature Guided Diffusion Model for Speech Enhancement |  | 0 |
| The Effect of Training Dataset Size on Discriminative and Diffusion-Based Speech Enhancement Systems |  | 0 |
| EARS: An Anechoic Fullband Speech Dataset Benchmarked for Speech Enhancement and Dereverberation | Code | 3 |
| Thunder : Unified Regression-Diffusion Speech Enhancement with a Single Reverse Step using Brownian Bridge |  | 0 |
| An Investigation of Noise Robustness for Flow-Matching-Based Zero-Shot TTS |  | 0 |
| URGENT Challenge: Universality, Robustness, and Generalizability For Speech Enhancement |  | 0 |
| Flexible Multichannel Speech Enhancement for Noise-Robust Frontend |  | 0 |
| Helsinki Speech Challenge 2024 |  | 0 |
| Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enhancement | Code | 1 |
| PLDNet: PLD-Guided Lightweight Deep Network Boosted by Efficient Attention for Handheld Dual-Microphone Speech Enhancement |  | 0 |
| Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment | Code | 1 |
| Reference Channel Selection by Multi-Channel Masking for End-to-End Multi-Channel Speech Enhancement |  | 0 |
| The PESQetarian: On the Relevance of Goodhart's Law for Speech Enhancement |  | 0 |
| Speech enhancement deep-learning architecture for efficient edge processing |  | 0 |
| A Variance-Preserving Interpolation Approach for Diffusion Models with Applications to Single Channel Speech Enhancement and Recognition | Code | 1 |
| Non-autoregressive real-time Accent Conversion model with voice cloning |  | 0 |
| Mamba in Speech: Towards an Alternative to Self-Attention | Code | 2 |
| Monaural speech enhancement on drone via Adapter based transfer learning |  | 0 |
| Building a Luganda Text-to-Speech Model From Crowdsourced Data |  | 0 |
| Evaluating Speech Enhancement Systems Through Listening Effort |  | 0 |
| An Investigation of Incorporating Mamba for Speech Enhancement | Code | 3 |
| Real-time multichannel deep speech enhancement in hearing aids: Comparing monaural and binaural processing in complex acoustic scenarios |  | 0 |
| TRAMBA: A Hybrid Transformer and Mamba Architecture for Practical Audio and Bone Conduction Speech Super Resolution and Enhancement on Mobile and Wearable Platforms |  | 0 |
| Deep low-latency joint speech transmission and enhancement over a gaussian channel |  | 0 |
| Rethinking Processing Distortions: Disentangling the Impact of Speech Enhancement Errors on Speech Recognition Performance |  | 0 |
| Exploring the Potential of Data-Driven Spatial Audio Enhancement Using a Single-Channel Model |  | 0 |
| TRNet: Two-level Refinement Network leveraging Speech Enhancement for Noise Robust Speech Emotion Recognition |  | 0 |
| FSPEN: AN ULTRA-LIGHTWEIGHT NETWORK FOR REAL TIME SPEECH ENAHNCMENT | Code | 2 |
| Efficient High-Performance Bark-Scale Neural Network for Residual Echo and Noise Suppression |  | 0 |
| Artificial Intelligence for Cochlear Implants: Review of Strategies, Challenges, and Perspectives |  | 0 |
| SuperM2M: Supervised and Mixture-to-Mixture Co-Learning for Speech Enhancement and Noise-Robust ASR |  | 0 |
| How to train your ears: Auditory-model emulation for large-dynamic-range inputs and mild-to-severe hearing losses | Code | 0 |
| Binaural Speech Enhancement Using Deep Complex Convolutional Transformer Networks | Code | 1 |
| A Closer Look at Wav2Vec2 Embeddings for On-Device Single-Channel Speech Enhancement |  | 0 |

Benchmark Results

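| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ROSE-CD(PESQ) | PESQ (wb) | 3.99 |  | Unverified |
| 2 | PESQetarian | PESQ (wb) | 3.82 |  | Unverified |
| 3 | Mamba-SEUNet L (+PCS) | PESQ (wb) | 3.73 |  | Unverified |
| 4 | Schrödinger bridge (PESQ loss) | PESQ (wb) | 3.7 |  | Unverified |
| 5 | SEMamba (+PCS) | PESQ (wb) | 3.69 |  | Unverified |
| 6 | ZipEnhancer (S, \lambda_6 = 0) | PESQ (wb) | 3.63 |  | Unverified |
| 7 | PrimeK-Net | PESQ (wb) | 3.61 |  | Unverified |
| 8 | ZipEnhancer (S, \lambda_6 = 0.2) | PESQ (wb) | 3.61 |  | Unverified |
| 9 | MP-SENet | PESQ (wb) | 3.6 |  | Unverified |
| 10 | PCS_CS_WAVLM | PESQ (wb) | 3.54 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | BSRNN-S + MGD | SI-SDR-WB | 21.4 |  | Unverified |
| 2 | DTLN | SI-SDR-WB | 16.34 |  | Unverified |
| 3 | Non-Real-Time MultiScale+ | SI-SDR-WB | 16.22 |  | Unverified |
| 4 | ZipEnhancer (M) | PESQ-WB | 3.81 |  | Unverified |
| 5 | TF-Locoformer (M) | PESQ-WB | 3.72 |  | Unverified |
| 6 | ZipEnhancer (S) | PESQ-WB | 3.69 |  | Unverified |
| 7 | MambAttention | PESQ-WB | 3.67 |  | Unverified |
| 8 | MP-SENet | PESQ-WB | 3.62 |  | Unverified |
| 9 | xLSTM-SENet | PESQ-WB | 3.59 |  | Unverified |
| 10 | BSRNN-S + MRSD | PESQ-WB | 3.53 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Inter-Channel Conv-TasNet | SDR | 19.67 |  | Unverified |
| 2 | CA Dense U-Net (Complex) | SDR | 18.64 |  | Unverified |
| 3 | Dense U-Net (Complex) | SDR | 18.4 |  | Unverified |
| 4 | Dense U-Net (Real) | SDR | 16.86 |  | Unverified |
| 5 | U-Net (Real) | SDR | 15.97 |  | Unverified |
| 6 | Noisy/unprocessed | SDR | 6.5 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Schrödinger Bridge (PESQ loss) | PESQ-WB | 3.09 |  | Unverified |
| 2 | SGMSE+ | PESQ-WB | 2.5 |  | Unverified |
| 3 | Demucs v4 | PESQ-WB | 2.37 |  | Unverified |
| 4 | Schrödinger Bridge | PESQ-WB | 2.33 |  | Unverified |
| 5 | Conv-TasNet | PESQ-WB | 2.31 |  | Unverified |
| 6 | CDiffuSE | PESQ-WB | 1.6 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ReVISE (ch2) | Audio Quality MOS | 4.19 |  | Unverified |
| 2 | ReVISE (bf) | Audio Quality MOS | 4.11 |  | Unverified |
| 3 | Demucs (ch2) | Audio Quality MOS | 2.95 |  | Unverified |
| 4 | Demucs (bf) | Audio Quality MOS | 2.39 |  | Unverified |
| 5 | MaxDI (Baseline) | PESQ | 1.17 |  | Unverified |
| 6 | DAJA (MVDR,HMA,1000) (Overlapped Speech) | SDR | -4.76 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | ZipEnhancer (M) | PESQ-NB | 4.08 |  | Unverified |
| 2 | DCCRN-MC | PESQ-NB | 3.21 |  | Unverified |
| 3 | DCCRN-M | PESQ-NB | 3.15 |  | Unverified |
| 4 | DCCRN | PESQ-NB | 3.04 |  | Unverified |
| 5 | RNN-Modulation | PESQ-WB | 2.75 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MambAttention | ESTOI | 0.8 |  | Unverified |
| 2 | SEMamba | ESTOI | 0.8 |  | Unverified |
| 3 | xLSTM-SENet | ESTOI | 0.8 |  | Unverified |
| 4 | MP-SENet | ESTOI | 0.79 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | SepFormer | PESQ | 2.84 |  | Unverified |
| 2 | DTLN | PESQ | 2.23 |  | Unverified |
| 3 | Unprocessed | PESQ | 1.83 |  | Unverified |
| 4 | Non-Real-Time MultiScale+ | PESQ | 1.52 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | DCUNet-MC | PESQ-NB | 3.44 |  | Unverified |
| 2 | DCCRN-M | PESQ-NB | 3.28 |  | Unverified |
| 3 | DCUNet | PESQ-NB | 3.25 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | CleanMel-L-map | DNSMOS | 3.82 |  | Unverified |
| 2 | SpatialNet | DNSMOS BAK | 3.43 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | rose_cd(PESQ) | PESQ | 3.99 |  | Unverified |
| 2 | ROSE-CD | PESQ | 3.49 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Wave-U-Net | CBAK | 3.24 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Audio-Visual concat-ref | PESQ | 2.7 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | SE-MelGAN | Audio Quality MOS | 3.1 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | DeFT-AN | PESQ | 3.01 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Audio-Visual concat-ref | PESQ | 3.03 |  | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | SepFormer | PESQ | 3.07 |  | Unverified |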