Speaker Identification

Papers


Showing 51–100 of 248 papers

Title	Date	Tasks	Status
HPP-Voice: A Large-Scale Evaluation of Speech Embeddings for Multi-Phenotypic Classification	May 22, 2025	speaker-diarization, Speaker Diarization	Unverified
Quantized Approximate Signal Processing (QASP): Towards Homomorphic Encryption for audio	May 15, 2025	Speaker Identification, speech-recognition	Unverified
Speaker Fuzzy Fingerprints: Benchmarking Text-Based Identification in Multiparty Dialogues	Apr 21, 2025	Benchmarking, Speaker Identification	Unverified
From Dialect Gaps to Identity Maps: Tackling Variability in Speaker Verification	Apr 21, 2025	Data Augmentation, Speaker Identification	Unverified
Speech-FT: Merging Pre-trained And Fine-Tuned Speech Representation Models For Cross-Task Generalization	Feb 18, 2025	Automatic Speech Recognition, Speaker Identification	Unverified
A Preliminary Exploration with GPT-4o Voice Mode	Feb 14, 2025	Age Classification, Audio Deepfake Detection	Unverified
SCDiar: a streaming diarization system based on speaker change detection and speech recognition	Jan 28, 2025	Change Detection, speaker-diarization	Unverified
Characteristic-Specific Partial Fine-Tuning for Efficient Emotion and Speaker Adaptation in Codec Language Text-to-Speech Models	Jan 24, 2025	Emotion Classification, Speaker Identification	Unverified
PolInterviews -- A Dataset of German Politician Public Broadcast Interviews	Jan 8, 2025	Speaker Identification	Unverified
Friends-MMC: A Dataset for Multi-modal Multi-party Conversation Understanding	Dec 23, 2024	Speaker Identification	Code Available
Machine Unlearning reveals that the Gender-based Violence Victim Condition can be detected from Speech in a Speaker-Agnostic Setting	Nov 27, 2024	Machine Unlearning, Speaker Identification	Unverified
Towards Speaker Identification with Minimal Dataset and Constrained Resources using 1D-Convolution Neural Network	Nov 22, 2024	Data Augmentation, Speaker Identification	Code Available
Towards Advanced Speech Signal Processing: A Statistical Perspective on Convolution-Based Architectures and its Applications	Nov 20, 2024	Emotion Recognition, Speaker Identification	Unverified
Incorporating Talker Identity Aids With Improving Speech Recognition in Adversarial Environments	Oct 7, 2024	Speaker Identification, speech-recognition	Unverified
Exploring VQ-VAE with Prosody Parameters for Speaker Anonymization	Sep 24, 2024	Decoder, Speaker anonymization	Unverified
Enhancing Open-Set Speaker Identification through Rapid Tuning with Speaker Reciprocal Points and Negative Sample	Sep 24, 2024	Speaker Identification, Speaker Recognition	Unverified
How Redundant Is the Transformer Stack in Speech Representation Models?	Sep 10, 2024	Knowledge Distillation, Speaker Identification	Unverified
A Toolkit for Joint Speaker Diarization and Identification with Application to Speaker-Attributed ASR	Sep 9, 2024	Automatic Speech Recognition, speaker-diarization	Unverified
Just ASR + LLM? A Study on Speech Large Language Models' Ability to Identify and Understand Speaker in Spoken Dialogue	Sep 7, 2024	Question Answering, Speaker Identification	Code Available
Progressive Residual Extraction based Pre-training for Speech Representation Learning	Aug 31, 2024	Emotion Recognition, Representation Learning	Unverified
Deep Learning for Speaker Identification: Architectural Insights from AB-1 Corpus Analysis and Performance Evaluation	Aug 13, 2024	Speaker Identification	Code Available
Identifying Speakers in Dialogue Transcripts: A Text-based Approach Using Pretrained Language Models	Jul 16, 2024	Attribute, Speaker Identification	Code Available
DASB -- Discrete Audio and Speech Benchmark	Jun 20, 2024	Benchmarking, Emotion Recognition	Unverified
Evaluating Speaker Identity Coding in Self-supervised Models and Humans	Jun 14, 2024	Speaker Identification	Unverified
TIMIT Speaker Profiling: A Comparison of Multi-task learning and Single-task learning Approaches	Apr 18, 2024	Age Estimation, Classification	Unverified
Masked Modeling Duo: Towards a Universal Audio Pre-training Framework	Apr 9, 2024	Audio Classification	Unverified
Removing Speaker Information from Speech Representation using Variable-Length Soft Pooling	Apr 1, 2024	Speaker Identification, Speech Synthesis	Unverified
Hearing-Loss Compensation Using Deep Neural Networks: A Framework and Results From a Listening Test	Mar 15, 2024	Music Classification, Speaker Identification	Unverified
A Closer Look at Wav2Vec2 Embeddings for On-Device Single-Channel Speech Enhancement	Mar 3, 2024	Automatic Speech Recognition, Keyword Spotting	Unverified
Unraveling Adversarial Examples against Speaker Identification -- Techniques for Attack Detection and Victim Model Classification	Feb 29, 2024	Adversarial Attack, Classification	Unverified
Effect of utterance duration and phonetic content on speaker identification using second-order statistical methods	Feb 26, 2024	Speaker Identification	Unverified
Significance of Chirp MFCC as a Feature in Speech and Audio Applications	Feb 19, 2024	Music Classification, Speaker Identification	Unverified
Probing Self-supervised Learning Models with Target Speech Extraction	Feb 17, 2024	Self-Supervised Learning, Speaker Identification	Unverified
Speech Rhythm-Based Speaker Embeddings Extraction from Phonemes and Phoneme Duration for Multi-Speaker Speech Synthesis	Feb 11, 2024	Rhythm, Speaker Identification	Unverified
Post-Training Embedding Alignment for Decoupling Enrollment and Runtime Speaker Recognition Models	Jan 23, 2024	Speaker Identification, Speaker Recognition	Unverified
SIG: Speaker Identification in Literature via Prompt-Based Generation	Dec 22, 2023	Speaker Identification	Code Available
Voxceleb-ESP: preliminary experiments detecting Spanish celebrities from their voices	Dec 20, 2023	Speaker Identification, Speaker Recognition	Unverified
Efficiency-oriented approaches for self-supervised speech representation learning	Dec 18, 2023	Automatic Speech Recognition, Representation Learning	Unverified
Privacy-preserving Representation Learning for Speech Understanding	Oct 26, 2023	Classification, Emotion Recognition	Unverified
Advanced accent/dialect identification and accentedness assessment with multi-embedding models and automatic speech recognition	Oct 17, 2023	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified
End-to-end Multichannel Speaker-Attributed ASR: Speaker Guided Decoder and Input Feature Analysis	Oct 16, 2023	Automatic Speech Recognition, Decoder	Unverified
Test-Time Training for Speech	Sep 19, 2023	parameter-efficient fine-tuning, Speaker Identification	Unverified
Spiking-LEAF: A Learnable Auditory front-end for Spiking Neural Networks	Sep 18, 2023	Keyword Spotting, Speaker Identification	Unverified
Understanding Self-Supervised Learning of Speech Representation via Invariance and Redundancy Reduction	Sep 7, 2023	Keyword Spotting, Self-Supervised Learning	Unverified
An Effective Transformer-based Contextual Model and Temporal Gate Pooling for Speaker Identification	Aug 22, 2023	Self-Supervised Learning, Speaker Identification	Code Available
Gammatonegram Representation for End-to-End Dysarthric Speech Processing Tasks: Speech Recognition, Speaker Identification, and Intelligibility Assessment	Jul 6, 2023	Speaker Identification, speech-recognition	Code Available
Read, Look or Listen? What's Needed for Solving a Multimodal Dataset	Jul 6, 2023	Question Answering, Speaker Identification	Unverified
VoxWatch: An open-set speaker recognition benchmark on VoxCeleb	Jun 30, 2023	Speaker Identification, Speaker Recognition	Unverified
Meta-Learning Framework for End-to-End Imposter Identification in Unseen Speaker Recognition	Jun 1, 2023	Meta-Learning, Speaker Identification	Unverified
Few-Shot Speaker Identification Using Lightweight Prototypical Network with Feature Grouping and Interaction	May 31, 2023	Speaker Identification	Unverified

Datasets: VoxCeleb1, EVI en-GB, EVI fr-FR, EVI pl-PL

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	MSM-MAE	Top-1 (%)	96.6	—	Unverified
2	M2D/0.6	Top-1 (%)	96.5	—	Unverified
3	M2D/0.7	Top-1 (%)	96.3	—	Unverified
4	M2D ratio=0.6	Top-1 (%)	94.8	—	Unverified
5	AudioMAE (local)	Top-1 (%)	94.8	—	Unverified
6	ATST Base (ours)	Top-1 (%)	94.3	—	Unverified
7	AudioMAE (global)	Top-1 (%)	94.1	—	Unverified
8	AutoSpeech (N=8,C=128)	Top-1 (%)	87.66	—	Unverified
9	SSAST-FRAME	Top-1 (%)	80.8	—	Unverified
10	SSAMBA	Top-1 (%)	70.1	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Fuzzy Retrieval	Top-1 (%)	67.77	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Fuzzy Retrieval	Top-1 (%)	80.83	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Fuzzy Retrieval	Top-1 (%)	95.13	—	Unverified