SOTAVerified

Speech Enhancement

Speech enhancement is a signal processing task that improves the quality of speech signals captured under noisy or degraded conditions. The goal is to make speech clearer, more intelligible, and more pleasant to listen to, which benefits applications such as voice recognition, teleconferencing, and hearing aids. A representative GitHub project with an online demo: ClearerVoice-Studio.

(Image credit: A Fully Convolutional Neural Network For Speech Enhancement)

Papers

Showing 751-800 of 982 papers

Title | Status | Hype
Time-Domain Multi-modal Bone/air Conducted Speech Enhancement |  | 0
Time-Domain Speech Enhancement for Robust Automatic Speech Recognition |  | 0
Time-Variance Aware Real-Time Speech Enhancement |  | 0
To Dereverb Or Not to Dereverb? Perceptual Studies On Real-Time Dereverberation Targets |  | 0
TouchTTS: An Embarrassingly Simple TTS Framework that Everyone Can Touch |  | 0
Towards Advanced Speech Signal Processing: A Statistical Perspective on Convolution-Based Architectures and its Applications |  | 0
Towards efficient models for real-time deep noise suppression |  | 0
Towards Generalized Speech Enhancement with Generative Adversarial Networks |  | 0
Towards Low-distortion Multi-channel Speech Enhancement: The ESPNet-SE Submission to The L3DAS22 Challenge |  | 0
Towards Robust Real-time Audio-Visual Speech Enhancement |  | 0
Towards Robust Speaker Verification with Target Speaker Enhancement |  | 0
Towards speech enhancement using a variational U-Net architecture |  | 0
Towards Sub-millisecond Latency Real-Time Speech Enhancement Models on Hearables |  | 0
Toward Universal Speech Enhancement for Diverse Input Conditions |  | 0
Trainable Adaptive Window Switching for Speech Enhancement |  | 0
Training Speech Enhancement Systems with Noisy Speech Datasets |  | 0
TRAMBA: A Hybrid Transformer and Mamba Architecture for Practical Audio and Bone Conduction Speech Super Resolution and Enhancement on Mobile and Wearable Platforms |  | 0
Transformers in Speech Processing: A Survey |  | 0
Transformers with Competitive Ensembles of Independent Mechanisms |  | 0
Translation-Invariant Shrinkage/Thresholding of Group Sparse Signals |  | 0
TridentSE: Guiding Speech Enhancement with 32 Global Tokens |  | 0
TRNet: Two-level Refinement Network leveraging Speech Enhancement for Noise Robust Speech Emotion Recognition |  | 0
TSTNN: Two-stage Transformer based Neural Network for Speech Enhancement in the Time Domain |  | 0
TS-URGENet: A Three-stage Universal Robust and Generalizable Speech Enhancement Network |  | 0
Convolutional Recurrent Neural Network with Attention for 3D Speech Enhancement |  | 0
Two-Step Knowledge Distillation for Tiny Speech Enhancement |  | 0
Ultra-Lightweight Speech Separation via Group Communication |  | 0
Ultra Low Complexity Deep Learning Based Noise Suppression |  | 0
Ultra-Low Latency Speech Enhancement - A Comprehensive Study |  | 0
Uncertainty Estimation in Deep Speech Enhancement Using Complex Gaussian Mixture Models |  | 0
UNetGAN: A Robust Speech Enhancement Approach in Time Domain for Extremely Low Signal-to-noise Ratio Condition |  | 0
Unified Architecture and Unsupervised Speech Disentanglement for Speaker Embedding-Free Enrollment in Personalized Speech Enhancement |  | 0
Unifying Robustness and Fidelity: A Comprehensive Study of Pretrained Generative Methods for Speech Enhancement in Adverse Conditions |  | 0
Unrestricted Global Phase Bias-Aware Single-channel Speech Enhancement with Conformer-based Metric GAN |  | 0
Unsupervised Face-Masked Speech Enhancement Using Generative Adversarial Networks With Human-in-the-Loop Assessment Metrics |  | 0
Unsupervised Noise adaptation using Data Simulation |  | 0
Unsupervised Sound Separation Using Mixture Invariant Training |  | 0
Unsupervised Speech Enhancement Based on Multichannel NMF-Informed Beamforming for Noise-Robust Automatic Speech Recognition |  | 0
Unsupervised speech enhancement with deep dynamical generative speech and noise models |  | 0
Unsupervised Speech Enhancement with speech recognition embedding and disentanglement losses |  | 0
UP-Cycle-SENet: Unpaired Phase-aware Speech Enhancement Using Deep Complex Cycle Adversarial Networks |  | 0
URGENT Challenge: Universality, Robustness, and Generalizability For Speech Enhancement |  | 0
uSee: Unified Speech Enhancement and Editing with Conditional Diffusion Models |  | 0
Using recurrences in time and frequency within U-net architecture for speech enhancement |  | 0
Using RLHF to align speech enhancement approaches to mean-opinion quality scores |  | 0
Utterance-Wise Meeting Transcription System Using Asynchronous Distributed Microphones |  | 0
Variational Autoencoder for Personalized Pathological Speech Enhancement |  | 0
Variational Autoencoder for Speech Enhancement with a Noise-Aware Encoder |  | 0
Visual Speech Enhancement |  | 0
Voice Activity Detection using Temporal Characteristics of Autocorrelation Lag and Maximum Spectral Amplitude in Sub-bands |  | 0
Page 16 of 20

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ROSE-CD(PESQ) | PESQ (wb) | 3.99 |  | Unverified
2 | PESQetarian | PESQ (wb) | 3.82 |  | Unverified
3 | Mamba-SEUNet L (+PCS) | PESQ (wb) | 3.73 |  | Unverified
4 | Schrödinger bridge (PESQ loss) | PESQ (wb) | 3.7 |  | Unverified
5 | SEMamba (+PCS) | PESQ (wb) | 3.69 |  | Unverified
6 | ZipEnhancer (S, \lamba_6 = 0) | PESQ (wb) | 3.63 |  | Unverified
7 | PrimeK-Net | PESQ (wb) | 3.61 |  | Unverified
8 | ZipEnhancer (S, \lamba_6 = 0.2) | PESQ (wb) | 3.61 |  | Unverified
9 | MP-SENet | PESQ (wb) | 3.6 |  | Unverified
10 | PCS_CS_WAVLM | PESQ (wb) | 3.54 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | BSRNN-S + MGD | SI-SDR-WB | 21.4 |  | Unverified
2 | DTLN | SI-SDR-WB | 16.34 |  | Unverified
3 | Non-Real-Time MultiScale+ | SI-SDR-WB | 16.22 |  | Unverified
4 | ZipEnhancer (M) | PESQ-WB | 3.81 |  | Unverified
5 | TF-Locoformer (M) | PESQ-WB | 3.72 |  | Unverified
6 | ZipEnhancer (S) | PESQ-WB | 3.69 |  | Unverified
7 | MambAttention | PESQ-WB | 3.67 |  | Unverified
8 | MP-SENet | PESQ-WB | 3.62 |  | Unverified
9 | xLSTM-SENet | PESQ-WB | 3.59 |  | Unverified
10 | BSRNN-S + MRSD | PESQ-WB | 3.53 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Inter-Channel Conv-TasNet | SDR | 19.67 |  | Unverified
2 | CA Dense U-Net (Complex) | SDR | 18.64 |  | Unverified
3 | Dense U-Net (Complex) | SDR | 18.4 |  | Unverified
4 | Dense U-Net (Real) | SDR | 16.86 |  | Unverified
5 | U-Net (Real) | SDR | 15.97 |  | Unverified
6 | Noisy/unprocessed | SDR | 6.5 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Schrödinger Bridge (PESQ loss) | PESQ-WB | 3.09 |  | Unverified
2 | SGMSE+ | PESQ-WB | 2.5 |  | Unverified
3 | Demucs v4 | PESQ-WB | 2.37 |  | Unverified
4 | Schrödinger Bridge | PESQ-WB | 2.33 |  | Unverified
5 | Conv-TasNet | PESQ-WB | 2.31 |  | Unverified
6 | CDiffuSE | PESQ-WB | 1.6 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ReVISE (ch2) | Audio Quality MOS | 4.19 |  | Unverified
2 | ReVISE (bf) | Audio Quality MOS | 4.11 |  | Unverified
3 | Demucs (ch2) | Audio Quality MOS | 2.95 |  | Unverified
4 | Demucs (bf) | Audio Quality MOS | 2.39 |  | Unverified
5 | MaxDI (Baseline) | PESQ | 1.17 |  | Unverified
6 | DAJA (MVDR,HMA,1000) (Overlapped Speech) | SDR | -4.76 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | ZipEnhancer (M) | PESQ-NB | 4.08 |  | Unverified
2 | DCCRN-MC | PESQ-NB | 3.21 |  | Unverified
3 | DCCRN-M | PESQ-NB | 3.15 |  | Unverified
4 | DCCRN | PESQ-NB | 3.04 |  | Unverified
5 | RNN-Modulation | PESQ-WB | 2.75 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | MambAttention | ESTOI | 0.8 |  | Unverified
2 | SEMamba | ESTOI | 0.8 |  | Unverified
3 | xLSTM-SENet | ESTOI | 0.8 |  | Unverified
4 | MP-SENet | ESTOI | 0.79 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SepFormer | PESQ | 2.84 |  | Unverified
2 | DTLN | PESQ | 2.23 |  | Unverified
3 | Unprocessed | PESQ | 1.83 |  | Unverified
4 | Non-Real-Time MultiScale+ | PESQ | 1.52 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | DCUNet-MC | PESQ-NB | 3.44 |  | Unverified
2 | DCCRN-M | PESQ-NB | 3.28 |  | Unverified
3 | DCUNet | PESQ-NB | 3.25 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | CleanMel-L-map | DNSMOS | 3.82 |  | Unverified
2 | SpatialNet | DNSMOS BAK | 3.43 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | rose_cd(PESQ ) | PESQ | 3.99 |  | Unverified
2 | ROSE-CD | PESQ | 3.49 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Wave-U-Net | CBAK | 3.24 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Audio-Visual concat-ref | PESQ | 2.7 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SE-MelGAN | Audio Quality MOS | 3.1 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | DeFT-AN | PESQ | 3.01 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Audio-Visual concat-ref | PESQ | 3.03 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | SepFormer | PESQ | 3.07 |  | Unverified