Speaker Diarization
Speaker diarization is the task of segmenting an audio recording and co-indexing the segments by speaker. As the task is commonly defined, the goal is not to identify known speakers but to co-index segments attributed to the same speaker. In other words, diarization means finding speaker boundaries and grouping the segments that belong to the same speaker; the number of distinct speakers is determined as a by-product. Combined with speech recognition, diarization enables speaker-attributed speech-to-text transcription.
Source: Improving Diarization Robustness using Diversification, Randomization and the DOVER Algorithm
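To make the definition concrete, the following minimal sketch shows one possible way to represent diarization output and combine it with speech recognition output. The segment tuples, the anonymous labels `spk0`/`spk1`, the word timings, and the helper names `group_by_speaker` and `attribute_words` are all hypothetical and chosen for illustration; they do not come from any particular toolkit.

```python
from collections import defaultdict

# Hypothetical diarization output: (start_sec, end_sec, anonymous_label).
# Labels only say "same speaker" or "different speaker"; they carry no identity.
segments = [
    (0.0, 2.5, "spk0"),
    (2.5, 4.0, "spk1"),
    (4.0, 7.0, "spk0"),
]

# Hypothetical speech recognition output: (word, start_sec, end_sec).
words = [
    ("hello", 0.2, 0.6),
    ("there", 0.7, 1.1),
    ("hi", 2.7, 3.0),
    ("welcome", 4.3, 5.0),
]


def group_by_speaker(segments):
    """Collect the time spans that share a label (the co-indexing step)."""
    grouped = defaultdict(list)
    for start, end, label in segments:
        grouped[label].append((start, end))
    return dict(grouped)


def attribute_words(words, segments):
    """Assign each word to the segment that contains the word's midpoint."""
    attributed = []
    for word, w_start, w_end in words:
        midpoint = (w_start + w_end) / 2
        label = next(
            (lab for start, end, lab in segments if start <= midpoint < end),
            None,  # the word falls outside every segment
        )
        attributed.append((label, word))
    return attributed


grouped = group_by_speaker(segments)
num_speakers = len(grouped)  # the speaker count falls out as a by-product
transcript = attribute_words(words, segments)

for label, word in transcript:
    print(f"{label}: {word}")
```

Assigning words by midpoint is only one simple choice; it ignores overlapping speech, where a word could reasonably belong to more than one segment.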
Datasets: CALLHOME, NIST-SRE 2000, AMI Lapel, AMI MixHeadset, CH109, DIHARD, ETAPE, AMI, CALLHOME-109, AliMeeting, DIHARD II, Hub5'00 CallHome
Benchmark Results
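Most results below are reported as diarization error rate (DER), which is conventionally the sum of missed speech, false-alarm speech, and speaker-confusion time divided by the total reference speech time. The sketch below is a simplified frame-level version under stated assumptions: one label per fixed-length frame, at most one speaker per frame (no overlap), no forgiveness collar, and a brute-force search for the best label mapping. The function name `frame_der` and the toy label sequences are hypothetical. Published numbers come from dedicated scoring tools with their own settings, so this sketch is not expected to reproduce them.

```python
from itertools import permutations


def frame_der(reference, hypothesis):
    """Simplified frame-level DER in percent.

    Both inputs are equal-length lists with one label per frame; None marks
    non-speech. Assumes no overlapping speech and no collar.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")
    ref_speech = sum(r is not None for r in reference)
    if ref_speech == 0:
        raise ValueError("reference contains no speech frames")

    pairs = list(zip(reference, hypothesis))
    missed = sum(r is not None and h is None for r, h in pairs)
    false_alarm = sum(r is None and h is not None for r, h in pairs)

    # Frames where both sides claim speech; these can only be confusion errors.
    both = [(r, h) for r, h in pairs if r is not None and h is not None]
    ref_labels = sorted({r for r, _ in both} | {r for r in reference if r is not None})
    hyp_labels = sorted({h for h in hypothesis if h is not None})

    # Pad with None so surplus hypothesis speakers can map to "no one".
    padded = ref_labels + [None] * max(0, len(hyp_labels) - len(ref_labels))

    # Hypothesis labels are arbitrary, so try every one-to-one mapping and
    # keep the one with the fewest confused frames (fine for a few speakers).
    best_confusion = len(both)
    for candidate in permutations(padded, len(hyp_labels)):
        mapping = dict(zip(hyp_labels, candidate))
        confusion = sum(mapping[h] != r for r, h in both)
        best_confusion = min(best_confusion, confusion)

    return 100.0 * (missed + false_alarm + best_confusion) / ref_speech


# Toy example: 8 reference speech frames, with one missed frame,
# one false-alarm frame, and one confused frame.
reference = ["A", "A", "A", "A", "B", "B", "B", None, None, "B"]
hypothesis = ["x", "x", "x", "y", "y", "y", "y", None, "y", None]
der = frame_der(reference, hypothesis)
print(f"DER = {der:.1f}%")  # DER = 37.5%
```

The brute-force mapping grows factorially with the number of speakers; an assignment solver would be the usual replacement for recordings with many speakers.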
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | COS+NJW-SC (Oracle SAD) | DER(%) | 24.05 | — | Unverified |
| 2 | EEND | DER(%) | 23.07 | — | Unverified |
| 3 | COS+AHC (Oracle SAD) | DER(%) | 21.13 | — | Unverified |
| 4 | SA-EEND (2-spk, no-adapt) | DER(%) | 12.66 | — | Unverified |
| 5 | EEND-OLA | DER(%) | 12.57 | — | Unverified |
| 6 | SA-EEND (2-spk, adapted) | DER(%) | 10.76 | — | Unverified |
| 7 | TOLD | DER(%) | 10.14 | — | Unverified |
| 8 | COS+B-SC (Oracle SAD) | DER(%), ignoring overlap | 8.78 | — | Unverified |
| 9 | PLDA+AHC (Oracle SAD) | DER(%), ignoring overlap | 8.39 | — | Unverified |
| 10 | COS+NME-SC (Oracle SAD) | DER(%), ignoring overlap | 7.29 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | x-vector (PLDA + AHC) | DER(%) | 8.39 | — | Unverified |
| 2 | TitaNet-L (NME-SC) | DER(%) | 6.73 | — | Unverified |
| 3 | TitaNet-M (NME-SC) | DER(%) | 6.47 | — | Unverified |
| 4 | TitaNet-S (NME-SC) | DER(%) | 6.37 | — | Unverified |
| 5 | x-vector (MCGAN) | DER(%) | 5.73 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | ECAPA (SC) | DER(%) | 2.36 | — | Unverified |
| 2 | TitaNet-L (NME-SC) | DER(%) | 2.03 | — | Unverified |
| 3 | TitaNet-S (NME-SC) | DER(%) | 2.00 | — | Unverified |
| 4 | TitaNet-M (NME-SC) | DER(%) | 1.99 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | TitaNet-S (NME-SC) | DER(%) | 2.22 | — | Unverified |
| 2 | TitaNet-M (NME-SC) | DER(%) | 1.79 | — | Unverified |
| 3 | ECAPA (SC) | DER(%) | 1.78 | — | Unverified |
| 4 | TitaNet-L (NME-SC) | DER(%) | 1.73 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | x-vector (PLDA + AHC) | DER(%) | 9.72 | — | Unverified |
| 2 | TitaNet-L (NME-SC) | DER(%) | 1.19 | — | Unverified |
| 3 | TitaNet-M (NME-SC) | DER(%) | 1.13 | — | Unverified |
| 4 | TitaNet-S (NME-SC) | DER(%) | 1.11 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Baseline (the best result in the literature as of Oct. 2019) | DER(%) | 11.2 | — | Unverified |
| 2 | pyannote (MFCC) | DER(%) | 10.5 | — | Unverified |
| 3 | pyannote (waveform) | DER(%) | 9.9 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Baseline | DER(%) | 7.7 | — | Unverified |
| 2 | pyannote (MFCC) | DER(%) | 5.6 | — | Unverified |
| 3 | pyannote (waveform) | DER(%) | 4.9 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | pyannote (MFCC) | DER(%) | 6.3 | — | Unverified |
| 2 | pyannote (waveform) | DER(%) | 6.0 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | d-vector + spectral | DER(%) | 12.54 | — | Unverified |
| 2 | TitaNet-S | DER(%) | 1.11 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SOND | DER(%) | 4.46 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | UIS-RNN-SML | DER(%) | 27.3 | — | Unverified |
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | UIS-RNN | V | 10.6 | — | Unverified |