Speaker Diarization

Speaker diarization is the task of segmenting and co-indexing audio recordings by speaker. As commonly defined, the goal is not to identify known speakers but to co-index segments attributed to the same speaker. In other words, diarization means finding speaker boundaries, grouping segments that belong to the same speaker, and, as a by-product, determining the number of distinct speakers. Combined with speech recognition, diarization enables speaker-attributed speech-to-text transcription.
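
The grouping step described above can be illustrated with a small sketch. It assumes the audio has already been cut into segments and that each segment comes with a speaker embedding; the 2-D vectors, the `diarize` helper, and the 0.8 cosine threshold are all hypothetical choices made for illustration, not the method of any paper listed here. Real systems use learned embeddings (for example x-vectors) and more robust clustering.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def diarize(segments, threshold=0.8):
    """Greedy clustering of (start, end, embedding) segments.

    Each segment joins the most similar existing speaker if the cosine
    similarity reaches `threshold`; otherwise it opens a new speaker.
    Returns the labeled segments and the number of speakers found.
    """
    sums = []    # running sum of embeddings per speaker (direction = centroid)
    labels = []
    for start, end, emb in segments:
        best, best_sim = None, threshold
        for idx, total in enumerate(sums):
            sim = cosine(emb, total)
            if sim >= best_sim:
                best, best_sim = idx, sim
        if best is None:
            sums.append(list(emb))
            best = len(sums) - 1
        else:
            sums[best] = [t + e for t, e in zip(sums[best], emb)]
        labels.append((start, end, f"spk{best}"))
    return labels, len(sums)

# Toy input: four segments with made-up embeddings from two voices.
segments = [
    (0.0, 2.1, (1.0, 0.1)),
    (2.1, 4.0, (0.1, 1.0)),
    (4.0, 6.5, (0.9, 0.2)),
    (6.5, 8.0, (0.2, 0.9)),
]
labels, num_speakers = diarize(segments)
for start, end, speaker in labels:
    print(f"{start:4.1f}-{end:4.1f}s  {speaker}")
print("speakers found:", num_speakers)
```

Note that the labels (`spk0`, `spk1`) are arbitrary indices rather than identities, and the number of speakers falls out of the clustering instead of being supplied up front, which matches the task definition above.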

Source: Improving Diarization Robustness using Diversification, Randomization and the DOVER Algorithm

Papers


Showing 226–250 of 328 papers

Title	Date	Tasks	Status
A Real-time Speaker Diarization System Based on Spatial Spectrum	Jul 20, 2021	Speaker Diarization	Unverified
A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio	Jul 6, 2021	Automatic Speech Recognition (ASR)	Unverified
Separation Guided Speaker Diarization in Realistic Mismatched Conditions	Jul 6, 2021	Clustering, Speaker Diarization	Unverified
Development of a Conversation State Prediction System	Jul 3, 2021	Prediction, Speaker Diarization	Unverified
Speaker-conversation factorial designs for diarization error analysis	Jun 10, 2021	Clustering, Speaker Diarization	Unverified
End-to-End Speaker Diarization Conditioned on Speech Activity and Overlap Detection	Jun 8, 2021	Clustering, Speaker Diarization	Unverified
DIVE: End-to-end Speech Diarization via Iterative Speaker Embedding	May 28, 2021	Speaker Diarization	Unverified
X-Vectors with Multi-Scale Aggregation for Speaker Diarization	May 16, 2021	Speaker Diarization	Unverified
Self-supervised Representation Learning With Path Integral Clustering For Speaker Diarization	Apr 19, 2021	Clustering, Representation Learning	Code Available
Three-class Overlapped Speech Detection using a Convolutional Recurrent Neural Network	Apr 7, 2021	Binary Classification, Speaker Diarization	Unverified
LEAP Submission for the Third DIHARD Diarization Challenge	Apr 6, 2021	Clustering, Speaker Diarization	Unverified
Speaker Diarization using Two-pass Leave-One-Out Gaussian PLDA Clustering of DNN Embeddings	Apr 6, 2021	Clustering, Speaker Diarization	Code Available
Speaker conditioned acoustic modeling for multi-speaker conversational ASR	Apr 5, 2021	Automatic Speech Recognition (ASR)	Unverified
ECAPA-TDNN Embeddings for Speaker Diarization	Apr 3, 2021	Speaker Diarization	Unverified
Data Fusion for Audiovisual Speaker Localization: Extending Dynamic Stream Weights to the Spatial Domain	Feb 23, 2021	Automatic Speech Recognition (ASR)	Code Available
Domain-Dependent Speaker Diarization for the Third DIHARD Challenge	Jan 25, 2021	Clustering, Dimensionality Reduction	Unverified
A Review of Speaker Diarization: Recent Advances with Deep Learning	Jan 24, 2021	Deep Learning, Retrieval	Unverified
End-to-End Speaker Diarization as Post-Processing	Dec 18, 2020	Clustering, Multi-Label Classification	Unverified
Speaker Recognition Based on Deep Learning: An Overview	Dec 2, 2020	Deep Learning, Domain Adaptation	Unverified
A Comprehensive Evaluation of Incremental Speech Recognition and Diarization for Conversational AI	Dec 1, 2020	Automatic Speech Recognition (ASR)	Code Available
VOXLINGUA107: A DATASET FOR SPOKEN LANGUAGE RECOGNITION	Nov 25, 2020	Action Detection, Activity Detection	Unverified
BW-EDA-EEND: Streaming End-to-End Neural Speaker Diarization for a Variable Number of Speakers	Nov 5, 2020	Clustering, Decoder	Unverified
Integration of speech separation, diarization, and recognition for multi-speaker meetings: System description, comparison, and analysis	Nov 3, 2020	Automatic Speech Recognition (ASR)	Unverified
Third DIHARD Challenge Evaluation Plan	Oct 30, 2020	Speaker Diarization	Unverified
EML System Description for VoxCeleb Speaker Diarization Challenge 2020	Oct 23, 2020	CPU, Speaker Diarization	Unverified


Datasets: CALLHOME, NIST-SRE 2000, AMI Lapel, AMI MixHeadset, CH109, DIHARD, ETAPE, AMI, CALLHOME-109, AliMeeting, DIHARD II, Hub5'00 CallHome

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	COS+NJW-SC (Oracle SAD)	DER(%)	24.05	—	Unverified
2	EEND	DER(%)	23.07	—	Unverified
3	COS+AHC (Oracle SAD)	DER(%)	21.13	—	Unverified
4	SA-EEND (2-spk, no-adapt)	DER(%)	12.66	—	Unverified
5	EEND-OLA	DER(%)	12.57	—	Unverified
6	SA-EEND (2-spk, adapted)	DER(%)	10.76	—	Unverified
7	TOLD	DER(%)	10.14	—	Unverified
8	COS+B-SC (Oracle SAD)	DER(%), ignoring overlap	8.78	—	Unverified
9	PLDA+AHC (Oracle SAD)	DER(%), ignoring overlap	8.39	—	Unverified
10	COS+NME-SC (Oracle SAD)	DER(%), ignoring overlap	7.29	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	x-vector (PLDA + AHC)	DER(%)	8.39	—	Unverified
2	TitaNet-L (NME-SC)	DER(%)	6.73	—	Unverified
3	TitaNet-M (NME-SC)	DER(%)	6.47	—	Unverified
4	TitaNet-S (NME-SC)	DER(%)	6.37	—	Unverified
5	x-vector (MCGAN)	DER(%)	5.73	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ECAPA (SC)	DER(%)	2.36	—	Unverified
2	TitaNet-L (NME-SC)	DER(%)	2.03	—	Unverified
3	TitaNet-S (NME-SC)	DER(%)	2	—	Unverified
4	TitaNet-M (NME-SC)	DER(%)	1.99	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	TitaNet-S (NME-SC)	DER(%)	2.22	—	Unverified
2	TitaNet-M (NME-SC)	DER(%)	1.79	—	Unverified
3	ECAPA (SC)	DER(%)	1.78	—	Unverified
4	TitaNet-L (NME-SC)	DER(%)	1.73	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	x-vector (PLDA + AHC)	DER(%)	9.72	—	Unverified
2	TitaNet-L (NME-SC)	DER(%)	1.19	—	Unverified
3	TitaNet-M (NME-SC)	DER(%)	1.13	—	Unverified
4	TitaNet-S (NME-SC)	DER(%)	1.11	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Baseline (the best result in the literature as of Oct.2019)	DER(%)	11.2	—	Unverified
2	pyannote (MFCC)	DER(%)	10.5	—	Unverified
3	pyannote (waveform)	DER(%)	9.9	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Baseline	DER(%)	7.7	—	Unverified
2	pyannote (MFCC)	DER(%)	5.6	—	Unverified
3	pyannote (waveform)	DER(%)	4.9	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	pyannote (MFCC)	DER(%)	6.3	—	Unverified
2	pyannote (waveform)	DER(%)	6	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	d-vector + spectral	DER(%)	12.54	—	Unverified
2	titanet-s	DER(%)	1.11	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	SOND	DER(%)	4.46	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	UIS-RNN-SML	DER(%)	27.3	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	UIS-RNN	V	10.6	—	Unverified
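
Most of the tables above report diarization error rate (DER). As commonly defined, DER is the sum of missed speech, false-alarm speech, and speaker-confusion time, divided by the total reference speech time. The sketch below computes it from durations that are assumed to be already measured; the function name and the example numbers are hypothetical. The tables do not state the scoring setup (such as a forgiveness collar or how overlapped speech is handled), so values from different rows or sources may not be directly comparable.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_reference):
    """Return DER in percent from error durations given in seconds.

    missed:          reference speech the system labeled as non-speech
    false_alarm:     non-speech the system labeled as speech
    confusion:       speech attributed to the wrong speaker
    total_reference: total speech time in the reference annotation
    """
    if total_reference <= 0:
        raise ValueError("total_reference must be positive")
    return 100.0 * (missed + false_alarm + confusion) / total_reference

# Made-up example: 120 s of reference speech with 9 s of total error.
der = diarization_error_rate(
    missed=3.0, false_alarm=1.5, confusion=4.5, total_reference=120.0
)
print(f"DER = {der:.2f}%")
```

Because false alarms count as errors but are not part of the reference speech time, DER can exceed 100% for a system that over-detects speech.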