DiaPer: End-to-End Neural Diarization with Perceiver-Based Attractors

2023-12-07

Federico Landini, Mireia Diez, Themos Stafylakis, Lukáš Burget

Code

github.com/butspeechfit/diaper
Official · In paper · PyTorch · ★ 67

Abstract

Until recently, the field of speaker diarization was dominated by cascaded systems. Due to their limitations, mainly regarding overlapped speech and cumbersome pipelines, end-to-end models have gained great popularity lately. One of the most successful models is end-to-end neural diarization with encoder-decoder based attractors (EEND-EDA). In this work, we replace the EDA module with a Perceiver-based one and show its advantages over EEND-EDA; namely obtaining better performance on the largely studied Callhome dataset, finding the quantity of speakers in a conversation more accurately, and faster inference time. Furthermore, when exhaustively compared with other methods, our model, DiaPer, reaches remarkable performance with a very lightweight design. Besides, we perform comparisons with other works and a cascaded baseline across more than ten public wide-band datasets. Together with this publication, we release the code of DiaPer as well as models trained on public and free data.

Tasks

Decoder, Speaker Diarization
