The DKU-MSXF Diarization System for the VoxCeleb Speaker Recognition Challenge 2023

2023-08-15

Ming Cheng, Weiqing Wang, Xiaoyi Qin, Yuke Lin, Ning Jiang, Guoqing Zhao, Ming Li

Abstract

This paper describes the DKU-MSXF submission to track 4 of the VoxCeleb Speaker Recognition Challenge 2023 (VoxSRC-23). Our system pipeline consists of voice activity detection, clustering-based diarization, overlapped speech detection, and target-speaker voice activity detection (TS-VAD), where the output of each stage is fused from three sub-models. Finally, we fuse the different clustering-based and TS-VAD-based diarization systems using DOVER-Lap and achieve a diarization error rate (DER) of 4.30%, which ranks first on the track 4 leaderboard of the challenge.
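
For readers unfamiliar with the metric, the sketch below illustrates how a diarization error rate is commonly defined: missed speech, false-alarm speech, and speaker confusion, summed and divided by the total reference speech. It is a simplified frame-level illustration, not the challenge's scoring tool. It assumes hypothesis speaker labels have already been mapped to reference labels and ignores any forgiveness collar; the function name `frame_level_der` and the toy data are hypothetical.

```python
from typing import List, Set


def frame_level_der(reference: List[Set[str]], hypothesis: List[Set[str]]) -> float:
    """Simplified frame-level DER.

    Each list element is the set of speakers active in one frame.
    Assumes hypothesis labels are already mapped to reference labels
    (real scorers find the optimal mapping first) and uses no collar.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same number of frames")

    missed = false_alarm = confusion = total = 0
    for ref, hyp in zip(reference, hypothesis):
        n_ref, n_hyp = len(ref), len(hyp)
        n_correct = len(ref & hyp)
        # Reference speakers with no hypothesis speaker to pair with.
        missed += max(0, n_ref - n_hyp)
        # Hypothesis speakers beyond the number of reference speakers.
        false_alarm += max(0, n_hyp - n_ref)
        # Paired speakers whose labels do not match.
        confusion += min(n_ref, n_hyp) - n_correct
        total += n_ref

    if total == 0:
        raise ValueError("reference contains no speech")
    return (missed + false_alarm + confusion) / total


# Toy example: five frames, two speakers.
reference = [{"A"}, {"A"}, {"A", "B"}, {"B"}, set()]
hypothesis = [{"A"}, {"B"}, {"A"}, {"B"}, {"B"}]
der = frame_level_der(reference, hypothesis)
print(f"DER = {der:.2%}")
```

In the toy example, frame 2 contributes one confusion error, frame 3 one miss (the overlapped speaker B is not detected), and frame 5 one false alarm, over five reference speaker-frames. The overlapped frame shows why overlap handling matters for the metric: a system that outputs a single speaker per frame is charged a miss on every overlapped frame.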

Tasks

Action Detection, Activity Detection, Clustering, Speaker Recognition
