Continuous Speech Separation with Conformer

2020-08-13

Sanyuan Chen, Yu Wu, Zhuo Chen, Jian Wu, Jinyu Li, Takuya Yoshioka, Chengyi Wang, Shujie Liu, Ming Zhou

Code

github.com/Sanyuan-Chen/CSS_with_Conformer
PyTorch, 120 stars

Abstract

Continuous speech separation plays a vital role in complex speech-related tasks such as conversation transcription. The separation model extracts a single-speaker signal from mixed speech. In this paper, we use transformer and conformer models in place of recurrent neural networks in the separation system, as we believe that capturing global information with self-attention is crucial for speech separation. Evaluated on the LibriCSS dataset, the conformer separation model achieves state-of-the-art results, with a relative word error rate (WER) reduction of 23.5% over a bi-directional LSTM (BLSTM) in the utterance-wise evaluation and a 15.4% WER reduction in the continuous evaluation.
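
The abstract reports its gains as relative WER reductions. As a reading aid, here is a minimal sketch of how WER and a relative reduction are commonly computed. It is not taken from the paper or its repository; the function names `word_error_rate` and `relative_reduction` are hypothetical, and the numbers in the usage lines are made up purely for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer improves on baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Made-up transcripts and WER values, for illustration only.
    print(word_error_rate("the cat sat down", "the cat sat"))
    print(relative_reduction(10.0, 8.0))
```

A relative reduction is measured against the baseline's own WER, so a drop from 10.0 to 8.0 is a 20% relative reduction but only a 2-point absolute one.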

Tasks

Speech Separation

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
LibriCSS	Conformer (large)	0S	5.4	—	Unverified
LibriCSS	Conformer (base)	0S	5.6	—	Unverified
