FSPEN: AN ULTRA-LIGHTWEIGHT NETWORK FOR REAL TIME SPEECH ENHANCEMENT

2024-04-15 · Conference 2024 · Code Available

Lei Yang1, Wei Liu1, Ruijie Meng1, Gunwoo Lee2, Soonho Baek2, Han-gil Moon2

Abstract

Deep learning-based speech enhancement methods have shown promising results in recent years. However, in practical applications, model size and computational complexity are important factors that limit their use in end products. Products that require real-time speech enhancement with limited resources, such as TWS headsets, hearing aids, and IoT devices, therefore need ultra-lightweight models. In this paper, an ultra-lightweight network, FSPEN, is proposed for real-time speech enhancement. We propose a full-band and sub-band network structure for extracting global and local features, and an inter-frame path extension method that enhances the network's modeling capacity without increasing complexity. Experiments demonstrate that the proposed FSPEN achieves a PESQ of 2.97 on the VoiceBank+Demand dataset with 89M multiply-accumulate operations per second (MACs) and 79k parameters.
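To put the reported budget in concrete terms, the following is a minimal back-of-the-envelope sketch, not taken from the paper. It converts the two figures quoted in the abstract (89M MACs per second, 79k parameters) into a weight-storage size and a per-frame compute cost. The helper names, the bytes-per-parameter choices, and in particular the 16 ms frame hop are assumptions made for illustration; the abstract does not state them.

```python
# Figures quoted in the abstract.
MACS_PER_SECOND = 89_000_000  # 89M multiply-accumulate operations per second
NUM_PARAMETERS = 79_000       # 79k parameters


def weight_bytes(num_params: int, bytes_per_param: int) -> int:
    """Storage needed for the weights at a given numeric precision."""
    return num_params * bytes_per_param


def macs_per_frame(macs_per_second: int, hop_ms: float) -> float:
    """MACs spent on each output frame for a given frame hop in milliseconds."""
    return macs_per_second * hop_ms / 1000.0


if __name__ == "__main__":
    # Assumption: float32 (4 bytes) and int8 (1 byte) weight storage.
    print("float32 weights:", weight_bytes(NUM_PARAMETERS, 4), "bytes")
    print("int8 weights:   ", weight_bytes(NUM_PARAMETERS, 1), "bytes")

    # Assumption: a hypothetical 16 ms hop; the abstract gives no frame hop.
    assumed_hop_ms = 16.0
    print("MACs per frame: ", macs_per_frame(MACS_PER_SECOND, assumed_hop_ms))
```

Under these assumptions the weights fit in a few hundred kilobytes at float32 precision, which is the scale relevant to the headset and hearing-aid targets the abstract mentions. A different hop size scales the per-frame figure proportionally.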
