An Investigation of Incorporating Mamba for Speech Enhancement

2024-05-10

Rong Chao, Wen-Huang Cheng, Moreno La Quatra, Sabato Marco Siniscalchi, Chao-Han Huck Yang, Szu-Wei Fu, Yu Tsao

Code

github.com/roychao19477/semamba
Official · In paper · PyTorch · ★ 256

Abstract

This work aims to study a scalable state-space model (SSM), Mamba, for the speech enhancement (SE) task. We exploit a Mamba-based regression model to characterize speech signals and build an SE system upon Mamba, termed SEMamba. We explore the properties of Mamba by integrating it as the core model in both basic and advanced SE systems, along with utilizing signal-level distances as well as metric-oriented loss functions. SEMamba demonstrates promising results and attains a PESQ score of 3.55 on the VoiceBank-DEMAND dataset. When combined with the perceptual contrast stretching technique, the proposed SEMamba yields a new state-of-the-art PESQ score of 3.69.

Tasks

Mamba, Speech Enhancement

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
VoiceBank + DEMAND	SEMamba (+PCS)	PESQ (wb)	3.69	—	Unverified
