Speaker Recognition

Speaker recognition is the process of identifying or confirming a person's identity from segments of their speech.

Source: Margin Matters: Towards More Discriminative Deep Neural Network Embeddings for Speaker Recognition
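The two tasks in the definition above, identifying a speaker and confirming a claimed identity, can be sketched with cosine scoring over speaker embeddings. This is a minimal illustration, not the method of any listed paper: the function names, the toy 2-dimensional embeddings, and the 0.7 threshold are all hypothetical, and a real system would obtain embeddings from a trained model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def identify(probe, enrolled):
    """Identification: return the enrolled speaker whose embedding is closest."""
    return max(enrolled, key=lambda name: cosine(probe, enrolled[name]))

def verify(probe, claimed_embedding, threshold=0.7):
    """Verification: accept the claimed identity if similarity reaches the threshold.

    The threshold value is an assumption; in practice it is tuned on held-out trials.
    """
    return cosine(probe, claimed_embedding) >= threshold

# Toy embeddings (hypothetical); real embeddings have hundreds of dimensions.
enrolled = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
probe = [0.9, 0.1]

best_match = identify(probe, enrolled)
accept_alice = verify(probe, enrolled["alice"])
accept_bob = verify(probe, enrolled["bob"])
```

Identification compares the probe against every enrolled speaker, while verification compares it against a single claimed speaker and applies a decision threshold.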

Papers

Showing 1–25 of 435 papers

Title	Date	Tasks	Status	Hype	Score
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit	May 20, 2022	All, Automatic Speech Recognition (ASR)	Code Available	6	5
VoxBlink2: A 100K+ Speaker Recognition Corpus and the Open-Set Speaker-Identification Benchmark	Jul 16, 2024	Diversity, Speaker Identification	Code Available	5	5
Take the aTrain. Introducing an Interface for the Accessible Transcription of Interviews	Oct 18, 2023	CPU, GPU	Code Available	3	5
ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models	Jan 30, 2024	Self-Supervised Learning, Speaker Recognition	Code Available	3	5
Leveraging In-the-Wild Data for Effective Self-Supervised Pretraining in Speaker Recognition	Sep 21, 2023	Speaker Recognition	Code Available	3	5
Pushing the limits of raw waveform speaker recognition	Mar 16, 2022	Self-Supervised Learning, Speaker Recognition	Code Available	3	5
VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking	Oct 11, 2018	Speaker Recognition, Speaker Separation	Code Available	2	5
SEED: Speaker Embedding Enhancement Diffusion Model	May 22, 2025	model, Speaker Recognition	Code Available	2	5
Reshape Dimensions Network for Speaker Recognition	Jul 25, 2024	Speaker Recognition	Code Available	2	5
Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings	Mar 28, 2022	Speaker Recognition	Code Available	1	5
NPLDA: A Deep Neural PLDA Model for Speaker Verification	Feb 10, 2020	Speaker Recognition, Speaker Verification	Code Available	1	5
Meta-Learning for Short Utterance Speaker Recognition with Imbalance Length Pairs	Apr 6, 2020	Meta-Learning, Speaker Identification	Code Available	1	5
Neural PLDA Modeling for End-to-End Speaker Verification	Aug 11, 2020	Speaker Recognition, Speaker Verification	Code Available	1	5
Probabilistic Back-ends for Online Speaker Recognition and Clustering	Feb 19, 2023	Clustering, Online Clustering	Code Available	1	5
OLKAVS: An Open Large-Scale Korean Audio-Visual Speech Dataset	Jan 16, 2023	Audio-Visual Speech Recognition, Lip Reading	Code Available	1	5
Leveraging speaker attribute information using multi task learning for speaker verification and diarization	Oct 27, 2020	Attribute, Multi-Task Learning	Code Available	1	5
Bias in Automated Speaker Recognition	Jan 24, 2022	BIG-bench Machine Learning, Face Recognition	Code Available	1	5
Frame-level speaker embeddings for text-independent speaker recognition and analysis of end-to-end model	Sep 12, 2018	Speaker Recognition, Text-Independent Speaker Recognition	Code Available	1	5
Merkel Podcast Corpus: A Multimodal Dataset Compiled from 16 Years of Angela Merkel's Weekly Video Podcasts	May 24, 2022	Face Detection, Face Generation	Code Available	1	5
Fine-tuning wav2vec2 for speaker recognition	Sep 30, 2021	Classification, Speaker Recognition	Code Available	1	5
HLT-NUS SUBMISSION FOR 2020 NIST Conversational Telephone Speech SRE	Nov 12, 2021	Domain Adaptation, Speaker Recognition	Code Available	1	5
AutoSpeech: Neural Architecture Search for Speaker Recognition	May 7, 2020	image-classification, Image Classification	Code Available	1	5
Adversarial Attack and Defense Strategies for Deep Speaker Recognition Systems	Aug 18, 2020	Adversarial Attack, Adversarial Robustness	Code Available	1	5
Crossed-Time Delay Neural Network for Speaker Recognition	May 31, 2020	Speaker Recognition, Speaker Verification	Code Available	1	5
BERTphone: Phonetically-Aware Encoder Representations for Utterance-Level Speaker and Language Recognition	Jun 30, 2019	Avg, Representation Learning	Code Available	1	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	w2v2-aam	EER	1.88	—	Unverified
2	WavLM+ECAPA-TDNN	EER	0.39	—	Unverified
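
The metric in the table above, EER (equal error rate), is the operating point at which the false acceptance rate equals the false rejection rate. The sketch below approximates it by sweeping a threshold over the observed trial scores. It is an illustration only: the function name and the toy scores are hypothetical, and it returns a fraction, whereas EER is conventionally reported as a percentage (assumed, not confirmed, to be the case for the values above).

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER (as a fraction) by sweeping thresholds over observed scores.

    target_scores: scores of trials where both utterances share a speaker.
    nontarget_scores: scores of trials with different speakers.
    A trial is accepted when its score is at or above the threshold.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection rate: target trials scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance rate: non-target trials scored at or above the threshold.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer

# Toy scores (hypothetical): one target trial scores low, one non-target scores high.
targets = [0.9, 0.8, 0.7, 0.3]
nontargets = [0.6, 0.2, 0.1, 0.05]
eer = equal_error_rate(targets, nontargets)
```

With these toy scores, a threshold of 0.6 rejects one of four target trials and accepts one of four non-target trials, so both error rates are 0.25. Lower is better, and a system that perfectly separates the two score sets has an EER of 0.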