Speaker Verification

Speaker verification is the task of verifying the identity of a person from characteristics of their voice.
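As a rough illustration of that accept/reject decision, the sketch below compares a test utterance against an enrolled speaker using cosine similarity between fixed-length embeddings. The embedding vectors, the `verify` helper, and the 0.5 threshold are all hypothetical; real systems obtain embeddings from a trained model and tune the threshold on held-out trials.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(enrolled, test, threshold=0.5):
    """Accept the identity claim if the similarity reaches the threshold.

    The threshold value is an assumption made for this example only.
    """
    return cosine_similarity(enrolled, test) >= threshold


# Toy embeddings (hypothetical values, not produced by any real model).
enrolled = [0.9, 0.1, 0.4]
same_speaker = [0.8, 0.2, 0.5]
other_speaker = [-0.3, 0.9, -0.2]

print(verify(enrolled, same_speaker))   # similarity is high, claim accepted
print(verify(enrolled, other_speaker))  # similarity is negative, claim rejected
```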

( Image credit: Contrastive-Predictive-Coding-PyTorch )

Papers


Showing 1–50 of 746 papers

Title	Date	Tasks	Status	Hype
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit	May 20, 2022	All, Automatic Speech Recognition (ASR)	Code Available	6
VoxBlink2: A 100K+ Speaker Recognition Corpus and the Open-Set Speaker-Identification Benchmark	Jul 16, 2024	Diversity, Speaker Identification	Code Available	5
ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models	Jan 30, 2024	Self-Supervised Learning, Speaker Recognition	Code Available	3
Golden Gemini is All You Need: Finding the Sweet Spots for Speaker Verification	Dec 6, 2023	All, Speaker Verification	Code Available	3
SALMONN: Towards Generic Hearing Abilities for Large Language Models	Oct 20, 2023	Audio captioning, Automatic Speech Recognition	Code Available	3
Pushing the limits of raw waveform speaker recognition	Mar 16, 2022	Self-Supervised Learning, Speaker Recognition	Code Available	3
Magnitude-aware Probabilistic Speaker Embeddings	Feb 28, 2022	Out-of-Distribution Detection, Speaker Verification	Code Available	3
Ludwig: a type-based declarative deep learning toolbox	Sep 17, 2019	Decoder, Deep Learning	Code Available	3
Singer Identity Representation Learning using Self-Supervised Techniques	Jan 10, 2024	Domain Generalization, Representation Learning	Code Available	2
Towards A Unified Conformer Structure: from ASR to ASV Task	Nov 14, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	2
u-HuBERT: Unified Mixed-Modal Speech Pretraining And Zero-Shot Transfer to Unlabeled Modality	Jul 14, 2022	Speaker Verification, speech-recognition	Code Available	2
Learning Lip-Based Audio-Visual Speaker Embeddings with AV-HuBERT	May 15, 2022	Representation Learning, Speaker Verification	Code Available	2
SEF-PNet: Speaker Encoder-Free Personalized Speech Enhancement with Local and Global Contexts Aggregation	Jan 20, 2025	Speaker Verification, Speech Enhancement	Code Available	1
ExPO: Explainable Phonetic Trait-Oriented Network for Speaker Verification	Jan 10, 2025	Speaker Verification	Code Available	1
Mitigating Unauthorized Speech Synthesis for Voice Protection	Oct 28, 2024	Data Augmentation, Face Swapping	Code Available	1
Malacopula: adversarial automatic speaker verification attacks using a neural-based generalised Hammerstein model	Aug 17, 2024	Adversarial Attack, Speaker Verification	Code Available	1
Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization	Jul 25, 2024	speaker-diarization, Speaker Diarization	Code Available	1
Vibravox: A Dataset of French Speech Captured with Body-conduction Audio Sensors	Jul 16, 2024	Automatic Phoneme Recognition, Automatic Speech Recognition (ASR)	Code Available	1
Revisiting and Improving Scoring Fusion for Spoofing-aware Speaker Verification Using Compositional Data Analysis	Jun 16, 2024	Speaker Verification	Code Available	1
MR-RawNet: Speaker verification system with multiple temporal resolutions for variable duration utterances using raw waveforms	Jun 11, 2024	Speaker Verification	Code Available	1
A New Perspective on Speaker Verification: Joint Modeling with DFSMN and Transformer	Dec 28, 2023	Speaker Verification	Code Available	1
NeXt-TDNN: Modernizing Multi-Scale Temporal Convolution Backbone for Speaker Verification	Dec 14, 2023	Speaker Verification	Code Available	1
Diff-SV: A Unified Hierarchical Framework for Noise-Robust Speaker Verification Using Score-Based Diffusion Probabilistic Models	Sep 14, 2023	Speaker Verification, Speech Enhancement	Code Available	1
Exploring Binary Classification Loss For Speaker Verification	Jul 17, 2023	Binary Classification, Classification	Code Available	1
Disentanglement in a GAN for Unconditional Speech Synthesis	Jul 4, 2023	Disentanglement, Generative Adversarial Network	Code Available	1
Evaluation of Speech Representations for MOS prediction	Jun 16, 2023	Prediction, Self-Supervised Learning	Code Available	1
One-Step Knowledge Distillation and Fine-Tuning in Using Large Pre-Trained Self-Supervised Learning Models for Speaker Verification	May 27, 2023	Knowledge Distillation, Self-Supervised Learning	Code Available	1
Bts-e: Audio deepfake detection using breathing-talking-silence encoder	May 5, 2023	Audio Deepfake Detection, DeepFake Detection	Code Available	1
CryCeleb: A Speaker Verification Dataset Based on Infant Cry Sounds	May 1, 2023	Speaker Verification	Code Available	1
DS-TDNN: Dual-stream Time-delay Neural Network with Global-aware Filter for Speaker Verification	Mar 20, 2023	Speaker Verification, Text-Independent Speaker Verification	Code Available	1
Cross-modal Audio-visual Co-learning for Text-independent Speaker Verification	Feb 22, 2023	Speaker Verification, Text-Independent Speaker Verification	Code Available	1
VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge	Feb 20, 2023	Speaker Diarization, Speaker Recognition	Code Available	1
Cross-modal information fusion for voice spoofing detection	Feb 1, 2023	Automatic Speech Recognition, fake voice detection	Code Available	1
SAMO: Speaker Attractor Multi-Center One-Class Learning for Voice Anti-Spoofing	Nov 4, 2022	Diversity, Speaker Verification	Code Available	1
Voice Spoofing Countermeasures: Taxonomy, State-of-the-art, experimental analysis of generalizability, open challenges, and the way forward	Oct 2, 2022	Misinformation, Speaker Verification	Code Available	1
The 2022 Far-field Speaker Verification Challenge: Exploring domain mismatch and semi-supervised learning under the far-field scenario	Sep 12, 2022	Speaker Verification	Code Available	1
DeID-VC: Speaker De-identification via Zero-shot Pseudo Voice Conversion	Sep 9, 2022	De-identification, Speaker Verification	Code Available	1
IndicSUPERB: A Speech Processing Universal Performance Benchmark for Indian languages	Aug 24, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	1
Non-Contrastive Self-Supervised Learning of Utterance-Level Speech Representations	Aug 10, 2022	Emotion Recognition, Self-Supervised Learning	Code Available	1
Cross-Age Speaker Verification: Learning Age-Invariant Speaker Embeddings	Jul 13, 2022	Age Estimation, Speaker Verification	Code Available	1
Extended U-Net for Speaker Verification in Noisy Environments	Jun 27, 2022	Denoising, Speaker Identification	Code Available	1
Frequency and Multi-Scale Selective Kernel Attention for Speaker Verification	Apr 3, 2022	Speaker Verification	Code Available	1
Robust Disentangled Variational Speech Representation Learning for Zero-shot Voice Conversion	Mar 30, 2022	Data Augmentation, Decoder	Code Available	1
LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT	Mar 29, 2022	All, Automatic Speech Recognition	Code Available	1
The VoicePrivacy 2022 Challenge Evaluation Plan	Mar 23, 2022	Speaker Verification	Code Available	1
A^3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing	Mar 18, 2022	Representation Learning, Speaker Verification	Code Available	1
Explainable deepfake and spoofing detection: an attack analysis using SHapley Additive exPlanations	Feb 28, 2022	Face Swapping, Speaker Verification	Code Available	1
Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation	Feb 24, 2022	Audio Deepfake Detection, Data Augmentation	Code Available	1
A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification	Feb 10, 2022	Speaker Verification	Code Available	1
Bias in Automated Speaker Recognition	Jan 24, 2022	BIG-bench Machine Learning, Face Recognition	Code Available	1


Datasets: VoxCeleb, VoxCeleb1, CALLHOME, CN-CELEB, ASVspoof 2019 - LA, VibraVox (forehead accelerometer), VibraVox (headset microphone), VibraVox (rigid in-ear microphone), VibraVox (soft in-ear microphone), VibraVox (temple vibration pickup), VibraVox (throat microphone), VoxCeleb2

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Multi Task SSL	EER	1.98	—	Unverified
2	ReDimNet-B0-LM (1.0M)	EER	1.16	—	Unverified
3	TitanNet-S	EER	1.15	—	Unverified
4	ReDimNet-B0-LM-ASNorm (1.0M)	EER	1.07	—	Unverified
5	SpeechNAS	EER	1.02	—	Unverified
6	ReDimNet-B1-LM (2.2M)	EER	0.85	—	Unverified
7	TitanNet-M	EER	0.81	—	Unverified
8	ReDimNet-B1-LM-ASNorm (2.2M)	EER	0.73	—	Unverified
9	TitanNet-L	EER	0.68	—	Unverified
10	ReDimNet-B2-SF2-LM (4.7M)	EER	0.57	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Fine-tuned HuBERT Large	EER	2.36	—	Unverified
2	ReDimNet-B0-LM (1.0M)	EER	1.16	—	Unverified
3	ReDimNet-B0-LM-ASNorm (1.0M)	EER	1.07	—	Unverified
4	SpeechNAS	EER	1.02	—	Unverified
5	ReDimNet-B1-LM (2.2M)	EER	0.85	—	Unverified
6	ReDimNet-B1-LM-ASNorm (2.2M)	EER	0.73	—	Unverified
7	ReDimNet-B2-SF2-LM (4.7M)	EER	0.57	—	Unverified
8	ReDimNet-B2-SF2-LM-ASNorm (4.7M)	EER	0.52	—	Unverified
9	ReDimNet-B4-LM (6.3M)	EER	0.51	—	Unverified
10	ReDimNet-B3-LM (3.0M)	EER	0.5	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	GE2E	Cosine EER	3.55	—	Unverified
2		Cosine EER	2.38	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ResNet with Attention Backend	EER	10.77	—	Unverified
2	X-Vectors with Attention Backend	EER	10.12	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ECAPA-TDNN	minDCF	0	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ECAPA2	Test EER	0.01	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ECAPA2	Test EER	0	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ECAPA2	Test EER	0.03	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ECAPA2	Test EER	0.02	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ECAPA2	Test EER	0.08	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ECAPA2	Test EER	0.04	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ResNet-50	EER	100	—	Unverified
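
Most of the tables above report equal error rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. The sketch below shows one simple way to approximate it from trial scores by sweeping a threshold over the observed values. It assumes EER is expressed in percent, as is common; the function name and the toy scores are hypothetical, and evaluation toolkits typically interpolate the error curves rather than picking the closest observed threshold.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER (in percent) from two lists of trial scores.

    target_scores: scores of same-speaker trials (higher means more similar).
    nontarget_scores: scores of different-speaker trials.
    """
    best_gap, eer = float("inf"), None
    # Try every observed score as the decision threshold.
    for t in sorted(set(target_scores) | set(nontarget_scores)):
        # False rejection: a same-speaker trial scoring below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: a different-speaker trial at or above the threshold.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return 100.0 * eer


# Toy scores (hypothetical): one target trial scores below one non-target trial.
targets = [0.9, 0.8, 0.7, 0.3]
nontargets = [0.6, 0.2, 0.1, 0.05]
print(equal_error_rate(targets, nontargets))  # 25.0
```

Lower values are better: an EER of 0 means some threshold separates all target trials from all non-target trials in the evaluated set.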