Speaker Recognition

Speaker Recognition is the process of identifying a person, or confirming a claimed identity, from segments of their speech.

Source: Margin Matters: Towards More Discriminative Deep Neural Network Embeddings for Speaker Recognition
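The two halves of this definition, identifying a speaker and confirming a claimed identity, can be illustrated with a minimal sketch. It assumes each speech segment has already been mapped to a fixed-length embedding vector by some model; the toy vectors, the cosine scoring, the function names, and the 0.7 threshold are all illustrative assumptions and are not taken from the cited paper.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def identify(test_emb, enrolled):
    """Identification: pick the enrolled speaker whose embedding is closest."""
    return max(enrolled, key=lambda name: cosine(test_emb, enrolled[name]))

def verify(test_emb, claimed_emb, threshold=0.7):
    """Verification: accept the claimed identity if the score clears a threshold.

    The threshold value is a hypothetical placeholder; in practice it is tuned
    on held-out trials.
    """
    return cosine(test_emb, claimed_emb) >= threshold

# Toy 3-dimensional embeddings (hypothetical; real embeddings are much larger).
enrolled = {"alice": [0.9, 0.1, 0.0], "bob": [0.1, 0.9, 0.2]}
test = [0.8, 0.2, 0.1]

who = identify(test, enrolled)            # closest enrolled speaker
accepted = verify(test, enrolled["bob"])  # does the segment match the claim "bob"?
print(who, accepted)
```

Identification is a choice among enrolled speakers, while verification is a yes/no decision about a single claimed identity.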

Papers

Showing 1–50 of 435 papers

Title	Date	Tasks	Status	Hype
An Exploration of ECAPA-TDNN and x-vector Speaker Representations in Zero-shot Multi-speaker TTS	Jun 25, 2025	Speaker Recognition, text-to-speech	Unverified	0
A Comparative Evaluation of Deep Learning Models for Speech Enhancement in Real-World Noisy Environments	Jun 17, 2025	Denoising, Speaker Recognition	Unverified	0
CoLMbo: Speaker Language Model for Descriptive Profiling	Jun 11, 2025	Descriptive, Language Modeling	Code Available	0
Learning Speaker-Invariant Visual Features for Lipreading	Jun 9, 2025	Disentanglement, Lipreading	Unverified	0
Rhythm Features for Speaker Identification	Jun 7, 2025	Deep Learning, Rhythm	Unverified	0
Synthetic Speech Source Tracing using Metric Learning	Jun 3, 2025	Metric Learning, Self-Supervised Learning	Unverified	0
LASPA: Language Agnostic Speaker Disentanglement with Prefix-Tuned Cross-Attention	Jun 2, 2025	Anatomy, Disentanglement	Unverified	0
Investigating the Reasonable Effectiveness of Speaker Pre-Trained Models and their Synergistic Power for SingMOS Prediction	Jun 2, 2025	Speaker Recognition	Unverified	0
Source Tracing of Synthetic Speech Systems Through Paralinguistic Pre-Trained Representations	Jun 1, 2025	Emotion Recognition, Rhythm	Unverified	0
Pretraining Multi-Speaker Identification for Neural Speaker Diarization	May 30, 2025	speaker-diarization, Speaker Diarization	Unverified	0
Private kNN-VC: Interpretable Anonymization of Converted Speech	May 23, 2025	Speaker anonymization, Speaker Recognition	Code Available	0
SEED: Speaker Embedding Enhancement Diffusion Model	May 22, 2025	model, Speaker Recognition	Code Available	2
Analysis of ABC Frontend Audio Systems for the NIST-SRE24	May 21, 2025	Speaker Recognition	Unverified	0
SoCov: Semi-Orthogonal Parametric Pooling of Covariance Matrix for Speaker Recognition	Apr 23, 2025	Speaker Recognition	Unverified	0
From Dialect Gaps to Identity Maps: Tackling Variability in Speaker Verification	Apr 21, 2025	Data Augmentation, Speaker Identification	Unverified	0
Audio-to-Image Encoding for Improved Voice Characteristic Detection Using Deep Convolutional Neural Networks	Mar 7, 2025	Speaker Recognition	Unverified	0
Language Modelling for Speaker Diarization in Telephonic Interviews	Jan 28, 2025	Acoustic Modelling, Language Modelling	Unverified	0
VoxVietnam: a Large-Scale Multi-Genre Dataset for Vietnamese Speaker Recognition	Dec 31, 2024	Diversity, Speaker Recognition	Unverified	0
Investigating Prosodic Signatures via Speech Pre-Trained Models for Audio Deepfake Source Attribution	Dec 23, 2024	Audio Deepfake Detection, DeepFake Detection	Unverified	0
Study on Inter and Intra Speaker Variability in Speaker Recognition	Nov 12, 2024	Diversity, Speaker Recognition	Unverified	0
Multi-View Multi-Task Modeling with Speech Foundation Models for Speech Forensic Tasks	Oct 16, 2024	Age Estimation, Emotion Recognition	Unverified	0
Investigation of Speaker Representation for Target-Speaker Speech Processing	Oct 15, 2024	Action Detection, Activity Detection	Unverified	0
The OCON model: an old but green solution for distributable supervised classification for acoustic monitoring in smart cities	Oct 5, 2024	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
Enhancing Open-Set Speaker Identification through Rapid Tuning with Speaker Reciprocal Points and Negative Sample	Sep 24, 2024	Speaker Identification, Speaker Recognition	Unverified	0
Avengers Assemble: Amalgamation of Non-Semantic Features for Depression Detection	Sep 22, 2024	Depression Detection, Emotion Recognition	Unverified	0
Are Music Foundation Models Better at Singing Voice Deepfake Detection? Far-Better Fuse them with Speech Foundation Models	Sep 21, 2024	DeepFake Detection, Face Swapping	Unverified	0
Speaker-IPL: Unsupervised Learning of Speaker Characteristics with i-Vector based Pseudo-Labels	Sep 16, 2024	Speaker Recognition, Speaker Verification	Unverified	0
oboVox Far Field Speaker Recognition: A Novel Data Augmentation Approach with Pretrained Models	Sep 16, 2024	Data Augmentation, Speaker Recognition	Unverified	0
Text-To-Speech Synthesis In The Wild	Sep 13, 2024	Benchmarking, Speaker Recognition	Unverified	0
USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction	Sep 4, 2024	Speaker Recognition, Speech Separation	Code Available	1
Recursive Attentive Pooling for Extracting Speaker Embeddings from Multi-Speaker Recordings	Aug 30, 2024	speaker-diarization, Speaker Diarization	Unverified	0
The VoxCeleb Speaker Recognition Challenge: A Retrospective	Aug 27, 2024	Domain Adaptation, Speaker Recognition	Unverified	0
Convexity-based Pruning of Speech Representation Models	Aug 16, 2024	Keyword Spotting, Self-Supervised Learning	Unverified	0
Long-Term Conversation Analysis: Privacy-Utility Trade-off under Noise and Reverberation	Aug 1, 2024	Action Detection, Activity Detection	Unverified	0
VoxSim: A perceptual voice similarity dataset	Jul 26, 2024	Benchmarking, Speaker Recognition	Code Available	1
Reshape Dimensions Network for Speaker Recognition	Jul 25, 2024	Speaker Recognition	Code Available	2
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning	Jul 21, 2024	Representation Learning, Self-Supervised Learning	Unverified	0
Team HYU ASML ROBOVOX SP Cup 2024 System Description	Jul 16, 2024	Data Augmentation, Speaker Recognition	Unverified	0
VoxBlink2: A 100K+ Speaker Recognition Corpus and the Open-Set Speaker-Identification Benchmark	Jul 16, 2024	Diversity, Speaker Identification	Code Available	5
Phonetic Richness for Improved Automatic Speaker Verification	Jul 10, 2024	Speaker Recognition, Speaker Verification	Unverified	0
A voice and speech corpus of patients who underwent upper airway surgery in pre- and post-operative states	Jul 9, 2024	Articles, Classification	Code Available	0
Analyzing Speech Unit Selection for Textless Speech-to-Speech Translation	Jul 8, 2024	Automatic Speech Recognition, Emotion Recognition	Unverified	0
We Need Variations in Speech Generation: Sub-center Modelling for Speaker Embeddings	Jul 5, 2024	Speaker Recognition, Speech Synthesis	Unverified	0
Prosody-Driven Privacy-Preserving Dementia Detection	Jul 3, 2024	Attribute, Diagnostic	Code Available	0
Open-Source Conversational AI with SpeechBrain 1.0	Jun 29, 2024	Language Modeling, Language Modelling	Unverified	0
CEC: A Noisy Label Detection Method for Speaker Recognition	Jun 19, 2024	Speaker Recognition, Speaker Verification	Unverified	0
Challenging margin-based speaker embedding extractors by using the variational information bottleneck	Jun 18, 2024	Speaker Recognition	Unverified	0
PERSONA: An Application for Emotion Recognition, Gender Recognition and Age Estimation	Jun 10, 2024	Age Estimation, Emotion Recognition	Unverified	0
The Reasonable Effectiveness of Speaker Embeddings for Violence Detection	Jun 10, 2024	Speaker Recognition	Unverified	0
Fill in the Gap! Combining Self-supervised Representation Learning with Neural Audio Synthesis for Speech Inpainting	May 30, 2024	Audio Synthesis, Representation Learning	Unverified	0


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	w2v2-aam	EER	1.88	—	Unverified
2	WavLM+ECAPA-TDNN	EER	0.39	—	Unverified
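
The EER metric in the table above is the equal error rate: the operating point at which the false-acceptance rate equals the false-rejection rate, so lower is better. The sketch below shows one simple way to estimate it from verification scores. The score lists are made-up toy data, the function name is hypothetical, and the threshold sweep is a basic approximation, not the scoring tool used for the listed results.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    target_scores: scores of trials where the speaker claim is true.
    nontarget_scores: scores of trials where the claim is false.
    A trial is accepted when its score is >= the threshold.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection: a genuine trial scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: an impostor trial scored at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer

# Toy scores (hypothetical): higher means "more likely the same speaker".
targets = [0.9, 0.8, 0.75, 0.4]
nontargets = [0.1, 0.3, 0.5, 0.2]
eer = equal_error_rate(targets, nontargets)
print(f"EER = {eer:.2%}")
```

With few trials the two error curves may not cross exactly, which is why the sketch averages the two rates at the closest point; evaluation toolkits typically interpolate instead.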