Speaker Diarization
Speaker Diarization is the task of segmenting and co-indexing audio recordings by speaker. The way the task is commonly defined, the goal is not to identify known speakers, but to co-index segments that are attributed to the same speaker; in other words, diarization implies finding speaker boundaries and grouping segments that belong to the same speaker, and, as a by-product, determining the number of distinct speakers. In combination with speech recognition, diarization enables speaker-attributed speech-to-text transcription.
Source: Improving Diarization Robustness using Diversification, Randomization and the DOVER Algorithm
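As a rough illustration of the definition above, the following Python sketch represents a diarization output as time-stamped segments with arbitrary speaker indices, groups them by speaker, derives the speaker count, and attaches speaker labels to recognized words. The segment and word values, and the helper names `group_by_speaker` and `attribute_words`, are hypothetical and chosen only for this example; assigning words by midpoint is one simple convention, not the method of any particular system.

```python
from collections import defaultdict

# Hypothetical diarization output: (start_sec, end_sec, speaker_index).
# The indices are arbitrary cluster labels, not known identities.
segments = [
    (0.0, 2.5, 0),
    (2.5, 4.0, 1),
    (4.0, 7.0, 0),
    (7.0, 9.5, 2),
]


def group_by_speaker(segments):
    """Co-index segments: map each speaker label to its (start, end) spans."""
    grouped = defaultdict(list)
    for start, end, speaker in segments:
        grouped[speaker].append((start, end))
    return dict(grouped)


def attribute_words(words, segments):
    """Label each recognized word with the speaker active at its midpoint.

    `words` is a list of (start_sec, end_sec, text) tuples, for example
    from a speech recognizer. Words whose midpoint falls outside every
    segment receive the label None.
    """
    attributed = []
    for w_start, w_end, text in words:
        mid = (w_start + w_end) / 2
        label = None
        for s_start, s_end, speaker in segments:
            if s_start <= mid < s_end:
                label = speaker
                break
        attributed.append((label, text))
    return attributed


grouped = group_by_speaker(segments)
num_speakers = len(grouped)  # by-product: number of distinct speakers

# Hypothetical recognizer output used to build a speaker-attributed transcript.
words = [(0.2, 0.6, "hello"), (2.8, 3.3, "hi"), (7.4, 7.9, "thanks")]
transcript = attribute_words(words, segments)
```

Note that the speaker indices carry no identity: relabeling 0, 1, 2 as 2, 0, 1 would describe the same diarization.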
Papers
Datasets: CALLHOME, NIST-SRE 2000, AMI Lapel, AMI MixHeadset, CH109, DIHARD, ETAPE, AMI, CALLHOME-109, AliMeeting, DIHARD II, Hub5'00 CallHome
Benchmark Results
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Baseline | DER(%) | 7.7 | — | Unverified |
| 2 | pyannote (MFCC) | DER(%) | 5.6 | — | Unverified |
| 3 | pyannote (waveform) | DER(%) | 4.9 | — | Unverified |
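
The DER(%) column is the diarization error rate, conventionally the sum of missed speech, false-alarm speech, and speaker confusion, divided by the total reference speech time. The sketch below is a simplified, frame-level version of that idea under stated assumptions: one label per fixed-length frame, no overlapping speech, no scoring collar, and a brute-force search for the best speaker mapping that is only practical for a handful of speakers. The function name `frame_der` and the example labels are hypothetical, and this is not necessarily the scoring protocol behind the numbers in the table.

```python
from itertools import permutations


def frame_der(reference, hypothesis):
    """Frame-level diarization error rate in percent (simplified sketch).

    `reference` and `hypothesis` are equal-length lists holding one label
    per frame; None marks non-speech. Hypothesis labels are arbitrary, so
    every one-to-one mapping onto reference labels is tried and the one
    with the fewest confusions is kept.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")
    ref_speech = sum(r is not None for r in reference)
    if ref_speech == 0:
        raise ValueError("reference contains no speech frames")

    pairs = list(zip(reference, hypothesis))
    missed = sum(r is not None and h is None for r, h in pairs)
    false_alarm = sum(r is None and h is not None for r, h in pairs)
    both = [(r, h) for r, h in pairs if r is not None and h is not None]

    ref_labels = list(dict.fromkeys(r for r, _ in both))
    hyp_labels = list(dict.fromkeys(h for _, h in both))
    # Pad with None so that a hypothesis speaker may stay unmapped.
    targets = ref_labels + [None] * len(hyp_labels)

    best_correct = 0
    for assignment in permutations(targets, len(hyp_labels)):
        mapping = dict(zip(hyp_labels, assignment))
        correct = sum(mapping[h] == r for r, h in both)
        best_correct = max(best_correct, correct)

    confusion = len(both) - best_correct
    return 100.0 * (missed + false_alarm + confusion) / ref_speech


# Hypothetical 8-frame example: one missed frame, one false alarm,
# and one frame attributed to the wrong speaker.
reference = ["A", "A", "A", "B", "B", None, "B", "B"]
hypothesis = [0, 0, 1, 1, 1, 1, None, 1]
der = frame_der(reference, hypothesis)
```

Because the mapping is searched rather than fixed, a hypothesis that swaps speaker labels consistently still scores zero error, which matches the task definition: only the grouping matters, not the label names.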