Speaker Diarization
Speaker diarization is the task of segmenting an audio recording and co-indexing the segments by speaker. As commonly defined, the goal is not to identify known speakers but to co-index segments attributed to the same speaker. Diarization therefore involves finding speaker boundaries, grouping segments that belong to the same speaker and, as a by-product, determining the number of distinct speakers. Combined with speech recognition, diarization enables speaker-attributed speech-to-text transcription.
Source: Improving Diarization Robustness using Diversification, Randomization and the DOVER Algorithm
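To make the idea of co-indexing concrete, here is a minimal sketch in Python. It assumes each segment already comes with a fixed-size speaker embedding (the toy 3-dimensional vectors below are invented for illustration) and uses a simple greedy cosine-similarity threshold rule, which is a stand-in for the clustering methods real systems use. The function name `co_index` and the `threshold` value are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def co_index(segments, threshold=0.8):
    """Assign an anonymous speaker index to each (start, end, embedding) segment.

    Greedy rule (an illustrative assumption, not a published method):
    join the most similar existing cluster if its similarity reaches
    `threshold`, otherwise open a new cluster.
    """
    centroids = []  # running sum of embeddings per cluster
    labelled = []
    for start, end, emb in segments:
        best, best_sim = None, threshold
        for idx, centroid in enumerate(centroids):
            sim = cosine(emb, centroid)
            if sim >= best_sim:
                best, best_sim = idx, sim
        if best is None:
            centroids.append(list(emb))
            best = len(centroids) - 1
        else:
            centroids[best] = [x + y for x, y in zip(centroids[best], emb)]
        labelled.append((start, end, best))
    # The number of clusters is the by-product: the estimated speaker count.
    return labelled, len(centroids)

# Toy input: times in seconds, made-up embeddings.
segments = [
    (0.0, 2.1, [0.9, 0.1, 0.0]),
    (2.1, 4.0, [0.1, 0.9, 0.1]),
    (4.0, 5.5, [0.8, 0.2, 0.1]),
    (5.5, 7.0, [0.0, 0.8, 0.2]),
]
labelled, n_speakers = co_index(segments)
for start, end, speaker in labelled:
    print(f"{start:4.1f}-{end:4.1f}  speaker_{speaker}")
print("estimated speakers:", n_speakers)
```

The output labels are arbitrary indices rather than identities: the sketch only states which segments share a speaker, which is the distinction the definition above draws between diarization and speaker identification.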
Datasets: CALLHOME, NIST-SRE 2000, AMI Lapel, AMI MixHeadset, CH109, DIHARD, ETAPE, AMI, CALLHOME-109, AliMeeting, DIHARD II, Hub5'00 CallHome
Benchmark Results
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | COS+NJW-SC (Oracle SAD) | DER(%) | 24.05 | — | Unverified |
| 2 | EEND | DER(%) | 23.07 | — | Unverified |
| 3 | COS+AHC (Oracle SAD) | DER(%) | 21.13 | — | Unverified |
| 4 | SA-EEND (2-spk, no-adapt) | DER(%) | 12.66 | — | Unverified |
| 5 | EEND-OLA | DER(%) | 12.57 | — | Unverified |
| 6 | SA-EEND (2-spk, adapted) | DER(%) | 10.76 | — | Unverified |
| 7 | TOLD | DER(%) | 10.14 | — | Unverified |
| 8 | COS+B-SC (Oracle SAD) | DER(%), overlap ignored | 8.78 | — | Unverified |
| 9 | PLDA+AHC (Oracle SAD) | DER(%), overlap ignored | 8.39 | — | Unverified |
| 10 | COS+NME-SC (Oracle SAD) | DER(%), overlap ignored | 7.29 | — | Unverified |
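The metric in the table, diarization error rate (DER), is conventionally the sum of missed speech, false-alarm speech and speaker confusion, divided by the total reference speech time. The following is a simplified frame-level sketch in Python, not the scoring tool behind the listed numbers. It assumes one label per frame with `None` for non-speech, no overlapping speech and no forgiveness collar, and it finds the speaker mapping by brute force over permutations, which is only practical for a handful of speakers. The function name `frame_der` is hypothetical.

```python
from itertools import permutations

def frame_der(ref, hyp):
    """Frame-level DER in percent for single-speaker-per-frame label lists.

    ref, hyp: equal-length lists; None marks a non-speech frame.
    Simplifying assumptions: no overlap, no collar.
    """
    if len(ref) != len(hyp):
        raise ValueError("ref and hyp must have the same number of frames")
    total = sum(r is not None for r in ref)
    if total == 0:
        raise ValueError("reference contains no speech")

    pairs = list(zip(ref, hyp))
    missed = sum(r is not None and h is None for r, h in pairs)
    false_alarm = sum(r is None and h is not None for r, h in pairs)
    both = [(r, h) for r, h in pairs if r is not None and h is not None]

    # Hypothesis labels are arbitrary, so search for the one-to-one
    # mapping onto reference speakers that maximizes agreement.
    ref_spk = sorted({r for r in ref if r is not None})
    hyp_spk = sorted({h for h in hyp if h is not None})
    k = max(len(ref_spk), len(hyp_spk))
    padded_ref = ref_spk + [None] * (k - len(ref_spk))
    padded_hyp = hyp_spk + [None] * (k - len(hyp_spk))

    best_correct = 0
    for perm in permutations(padded_ref):
        mapping = {h: r for h, r in zip(padded_hyp, perm) if h is not None}
        correct = sum(mapping[h] == r for r, h in both)
        best_correct = max(best_correct, correct)

    confusion = len(both) - best_correct
    return 100.0 * (missed + false_alarm + confusion) / total

# Toy example with made-up labels: reference uses names, hypothesis uses indices.
ref = ["A", "A", "A", "B", "B", None, "B", "B", "A", "A"]
hyp = [1, 1, 2, 2, 2, 2, 2, None, 1, 1]
der = frame_der(ref, hyp)
print(f"DER = {der:.2f}%")
```

If "Oracle SAD" in the model names denotes that reference speech activity marks were supplied to the system (an assumption about the labelling, not something stated on this page), the missed and false-alarm terms would vanish for those entries and the score would reduce to speaker confusion alone.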