Speaker Recognition

Speaker recognition is the process of identifying a person, or confirming a claimed identity, from segments of their speech.

Source: Margin Matters: Towards More Discriminative Deep Neural Network Embeddings for Speaker Recognition
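The two halves of that definition, identifying and confirming, correspond to two scoring modes that can be sketched over fixed-length speaker embeddings. The following is a minimal illustration only: the three-dimensional vectors, the speaker names, the 0.7 threshold, and the function names are all assumptions made for this example, and cosine scoring is only one of several possible back-ends.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(test_embedding, enrolled_embedding, threshold=0.7):
    """Verification: accept or reject a claimed identity.

    The threshold is a hypothetical value; in practice it would be
    tuned on held-out trials.
    """
    return cosine_similarity(test_embedding, enrolled_embedding) >= threshold


def identify(test_embedding, enrolled):
    """Identification: return the enrolled speaker with the highest score."""
    return max(
        enrolled,
        key=lambda name: cosine_similarity(test_embedding, enrolled[name]),
    )


# Toy embeddings (assumed values; real systems use hundreds of dimensions).
enrolled = {
    "alice": [0.9, 0.1, 0.0],
    "bob": [0.0, 0.2, 0.9],
}
test = [0.8, 0.2, 0.1]

best_match = identify(test, enrolled)
accepted_as_alice = verify(test, enrolled["alice"])
accepted_as_bob = verify(test, enrolled["bob"])
```

Verification answers a yes/no question about one claimed speaker, while identification ranks every enrolled speaker, so its cost grows with the size of the enrolled set.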

Papers

Showing 1–25 of 435 papers

Title	Date	Tasks	Status	Hype
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit	May 20, 2022	All, Automatic Speech Recognition (ASR)	Code Available	6
VoxBlink2: A 100K+ Speaker Recognition Corpus and the Open-Set Speaker-Identification Benchmark	Jul 16, 2024	Diversity, Speaker Identification	Code Available	5
ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models	Jan 30, 2024	Self-Supervised Learning, Speaker Recognition	Code Available	3
Take the aTrain. Introducing an Interface for the Accessible Transcription of Interviews	Oct 18, 2023	CPU, GPU	Code Available	3
Leveraging In-the-Wild Data for Effective Self-Supervised Pretraining in Speaker Recognition	Sep 21, 2023	Speaker Recognition	Code Available	3
Pushing the limits of raw waveform speaker recognition	Mar 16, 2022	Self-Supervised Learning, Speaker Recognition	Code Available	3
SEED: Speaker Embedding Enhancement Diffusion Model	May 22, 2025	model, Speaker Recognition	Code Available	2
Reshape Dimensions Network for Speaker Recognition	Jul 25, 2024	Speaker Recognition	Code Available	2
VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking	Oct 11, 2018	Speaker Recognition, Speaker Separation	Code Available	2
USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction	Sep 4, 2024	Speaker Recognition, Speech Separation	Code Available	1
VoxSim: A perceptual voice similarity dataset	Jul 26, 2024	Benchmarking, Speaker Recognition	Code Available	1
SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge?	Jun 14, 2023	Natural Language Understanding, Self-Supervised Learning	Code Available	1
VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge	Feb 20, 2023	Speaker Diarization, Speaker Recognition	Code Available	1
Probabilistic Back-ends for Online Speaker Recognition and Clustering	Feb 19, 2023	Clustering, Online Clustering	Code Available	1
TAPLoss: A Temporal Acoustic Parameter Loss for Speech Enhancement	Feb 16, 2023	Speaker Recognition, Speech Enhancement	Code Available	1
OLKAVS: An Open Large-Scale Korean Audio-Visual Speech Dataset	Jan 16, 2023	Audio-Visual Speech Recognition, Lip Reading	Code Available	1
Speaker recognition with two-step multi-modal deep cleansing	Oct 28, 2022	Representation Learning, Speaker Recognition	Code Available	1
Toroidal Probabilistic Spherical Discriminant Analysis	Oct 27, 2022	Form, Speaker Recognition	Code Available	1
Towards Understanding and Mitigating Audio Adversarial Examples for Speaker Recognition	Jun 7, 2022	Speaker Recognition, speech-recognition	Code Available	1
Merkel Podcast Corpus: A Multimodal Dataset Compiled from 16 Years of Angela Merkel’s Weekly Video Podcasts	Jun 1, 2022	Face Detection, Face Generation	Code Available	1
Merkel Podcast Corpus: A Multimodal Dataset Compiled from 16 Years of Angela Merkel's Weekly Video Podcasts	May 24, 2022	Face Detection, Face Generation	Code Available	1
Speaker Recognition in the Wild	May 5, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	1
Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings	Mar 28, 2022	Speaker Recognition	Code Available	1
Training speaker recognition systems with limited data	Mar 28, 2022	Speaker Recognition	Code Available	1
Bias in Automated Speaker Recognition	Jan 24, 2022	BIG-bench Machine Learning, Face Recognition	Code Available	1

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	w2v2-aam	EER	1.88	—	Unverified
2	WavLM+ECAPA-TDNN	EER	0.39	—	Unverified
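
The EER (equal error rate) metric in the table above is the operating point at which the false acceptance rate equals the false rejection rate, so lower is better. The table does not state units; EER is commonly reported as a percentage. A minimal sketch of how it can be estimated from trial scores follows. The function name, the toy scores, and the simple threshold sweep are assumptions for illustration, not the evaluation code behind the listed results.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Estimate EER by sweeping a threshold over all observed scores.

    target_scores: scores for same-speaker trials (should be high).
    nontarget_scores: scores for different-speaker trials (should be low).
    Returns the mean of FAR and FRR at the threshold where they are closest,
    as a fraction between 0 and 1.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap = float("inf")
    eer = None
    for t in thresholds:
        # False rejection: a same-speaker trial scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: a different-speaker trial scored at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap = gap
            eer = (far + frr) / 2
    return eer


# Toy trial scores (assumed values, for illustration only).
target_scores = [0.9, 0.8, 0.75, 0.4]
nontarget_scores = [0.1, 0.3, 0.5, 0.2]

eer = equal_error_rate(target_scores, nontarget_scores)
print(f"EER = {eer * 100:.2f}%")
```

With few trials the FAR and FRR curves may not cross exactly at an observed score, which is why this sketch averages the two rates at the closest point; larger evaluations often interpolate instead.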