Speaker Identification

Papers



Title	Date	Tasks	Status
Identify Speakers in Cocktail Parties with End-to-End Attention	May 22, 2020	Speaker Identification, Speech Separation	Code Available
EVI: Multilingual Spoken Dialogue Tasks and Dataset for Knowledge-Based Enrolment, Verification, and Identification	Apr 28, 2022	Speaker Identification, Speaker Verification	Code Available
CoLMbo: Speaker Language Model for Descriptive Profiling	Jun 11, 2025	Descriptive, Language Modeling	Code Available
Audio ALBERT: A Lite BERT for Self-supervised Learning of Audio Representation	May 18, 2020	Self-Supervised Learning, Speaker Identification	Code Available
Cross-Lingual Speaker Identification Using Distant Supervision	Oct 11, 2022	Language Modeling, Language Modelling	Code Available
A domain-agnostic approach for opinion prediction on speech	Dec 1, 2016	Emotion Recognition, Feature Engineering	Code Available
Contrastive Learning of General-Purpose Audio Representations	Oct 21, 2020	CoLA, Contrastive Learning	Code Available
Identifying Speakers in Dialogue Transcripts: A Text-based Approach Using Pretrained Language Models	Jul 16, 2024	Attribute, Speaker Identification	Code Available
On Learning Associations of Faces and Voices	May 15, 2018	Speaker Identification	Code Available
Towards Making the Most of Dialogue Characteristics for Neural Chat Translation	Sep 2, 2021	Machine Translation, Response Generation	Code Available
Word-level Embeddings for Cross-Task Transfer Learning in Speech Processing	Oct 22, 2019	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available
PF-Net: Personalized Filter for Speaker Recognition from Raw Waveform	May 31, 2021	Speaker Identification, Speaker Recognition	Code Available
Attention-based multi-task learning for speech-enhancement and speaker-identification in multi-speaker dialogue scenario	Jan 7, 2021	Multi-Task Learning, Speaker Identification	Code Available
Towards Speaker Identification with Minimal Dataset and Constrained Resources using 1D-Convolution Neural Network	Nov 22, 2024	Data Augmentation, Speaker Identification	Code Available
Gammatonegram Representation for End-to-End Dysarthric Speech Processing Tasks: Speech Recognition, Speaker Identification, and Intelligibility Assessment	Jul 6, 2023	Speaker Identification, speech-recognition	Code Available
Compositional embedding models for speaker identification and diarization with simultaneous speech from 2+ speakers	Oct 22, 2020	speaker-diarization, Speaker Diarization	Code Available
PL-EESR: Perceptual Loss Based END-TO-END Robust Speaker Representation Extraction	Oct 3, 2021	Speaker Identification, Speaker Verification	Code Available
Compositional Clustering: Applications to Multi-Label Object Recognition and Speaker Identification	Sep 9, 2021	Clustering, Few-Shot Learning	Code Available
Friends-MMC: A Dataset for Multi-modal Multi-party Conversation Understanding	Dec 23, 2024	Speaker Identification	Code Available
An Effective Transformer-based Contextual Model and Temporal Gate Pooling for Speaker Identification	Aug 22, 2023	Self-Supervised Learning, Speaker Identification	Code Available
Deep Speaker: an End-to-End Neural Speaker Embedding System	May 5, 2017	Clustering, Speaker Identification	Code Available
A Generative Product-of-Filters Model of Audio	Dec 20, 2013	model, Speaker Identification	Code Available
Unsupervised Speech Representation Pooling Using Vector Quantization	Apr 8, 2023	Emotion Recognition, intent-classification	Code Available


Datasets: VoxCeleb1, EVI en-GB, EVI fr-FR, EVI pl-PL

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	MSM-MAE	Top-1 (%)	96.6	—	Unverified
2	M2D/0.6	Top-1 (%)	96.5	—	Unverified
3	M2D/0.7	Top-1 (%)	96.3	—	Unverified
4	M2D ratio=0.6	Top-1 (%)	94.8	—	Unverified
5	AudioMAE (local)	Top-1 (%)	94.8	—	Unverified
6	ATST Base (ours)	Top-1 (%)	94.3	—	Unverified
7	AudioMAE (global)	Top-1 (%)	94.1	—	Unverified
8	AutoSpeech (N=8,C=128)	Top-1 (%)	87.66	—	Unverified
9	SSAST-FRAME	Top-1 (%)	80.8	—	Unverified
10	SSAMBA	Top-1 (%)	70.1	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Fuzzy Retrieval	Top-1 (%)	67.77	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Fuzzy Retrieval	Top-1 (%)	80.83	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Fuzzy Retrieval	Top-1 (%)	95.13	—	Unverified