Speaker Identification

Papers

Showing 26–50 of 248 papers

Title	Date	Tasks	Status	Hype
How Redundant Is the Transformer Stack in Speech Representation Models?	Sep 10, 2024	Knowledge Distillation, Speaker Identification	Unverified	0
A Toolkit for Joint Speaker Diarization and Identification with Application to Speaker-Attributed ASR	Sep 9, 2024	Automatic Speech Recognition, speaker-diarization	Unverified	0
Just ASR + LLM? A Study on Speech Large Language Models' Ability to Identify and Understand Speaker in Spoken Dialogue	Sep 7, 2024	Question Answering, Speaker Identification	Code Available	0
Progressive Residual Extraction based Pre-training for Speech Representation Learning	Aug 31, 2024	Emotion Recognition, Representation Learning	Unverified	0
Deep Learning for Speaker Identification: Architectural Insights from AB-1 Corpus Analysis and Performance Evaluation	Aug 13, 2024	Speaker Identification	Code Available	0
Identifying Speakers in Dialogue Transcripts: A Text-based Approach Using Pretrained Language Models	Jul 16, 2024	Attribute, Speaker Identification	Code Available	0
VoxBlink2: A 100K+ Speaker Recognition Corpus and the Open-Set Speaker-Identification Benchmark	Jul 16, 2024	Diversity, Speaker Identification	Code Available	5
CoMix: A Comprehensive Benchmark for Multi-Task Comic Understanding	Jul 4, 2024	Dialogue Generation, object-detection	Code Available	1
DASB -- Discrete Audio and Speech Benchmark	Jun 20, 2024	Benchmarking, Emotion Recognition	Unverified	0
Evaluating Speaker Identity Coding in Self-supervised Models and Humans	Jun 14, 2024	Speaker Identification	Unverified	0
SSAMBA: Self-Supervised Audio Representation Learning with Mamba State Space Model	May 20, 2024	Audio Classification, GPU	Code Available	2
TIMIT Speaker Profiling: A Comparison of Multi-task learning and Single-task learning Approaches	Apr 18, 2024	Age Estimation, Classification	Unverified	0
Masked Modeling Duo: Towards a Universal Audio Pre-training Framework	Apr 9, 2024	Audio Classification	Unverified	0
Removing Speaker Information from Speech Representation using Variable-Length Soft Pooling	Apr 1, 2024	Speaker Identification, Speech Synthesis	Unverified	0
Hearing-Loss Compensation Using Deep Neural Networks: A Framework and Results From a Listening Test	Mar 15, 2024	Music Classification, Speaker Identification	Unverified	0
A Closer Look at Wav2Vec2 Embeddings for On-Device Single-Channel Speech Enhancement	Mar 3, 2024	Automatic Speech Recognition, Keyword Spotting	Unverified	0
Unraveling Adversarial Examples against Speaker Identification -- Techniques for Attack Detection and Victim Model Classification	Feb 29, 2024	Adversarial Attack, Classification	Unverified	0
Effect of utterance duration and phonetic content on speaker identification using second-order statistical methods	Feb 26, 2024	Speaker Identification	Unverified	0
Significance of Chirp MFCC as a Feature in Speech and Audio Applications	Feb 19, 2024	Music Classification, Speaker Identification	Unverified	0
Probing Self-supervised Learning Models with Target Speech Extraction	Feb 17, 2024	Self-Supervised Learning, Speaker Identification	Unverified	0
Speech Rhythm-Based Speaker Embeddings Extraction from Phonemes and Phoneme Duration for Multi-Speaker Speech Synthesis	Feb 11, 2024	Rhythm, Speaker Identification	Unverified	0
Post-Training Embedding Alignment for Decoupling Enrollment and Runtime Speaker Recognition Models	Jan 23, 2024	Speaker Identification, Speaker Recognition	Unverified	0
SIG: Speaker Identification in Literature via Prompt-Based Generation	Dec 22, 2023	Speaker Identification	Code Available	0
Voxceleb-ESP: preliminary experiments detecting Spanish celebrities from their voices	Dec 20, 2023	Speaker Identification, Speaker Recognition	Unverified	0
Efficiency-oriented approaches for self-supervised speech representation learning	Dec 18, 2023	Automatic Speech Recognition, Representation Learning	Unverified	0

Datasets: VoxCeleb1, EVI en-GB, EVI fr-FR, EVI pl-PL

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	MSM-MAE	Top-1 (%)	96.6	—	Unverified
2	M2D/0.6	Top-1 (%)	96.5	—	Unverified
3	M2D/0.7	Top-1 (%)	96.3	—	Unverified
4	M2D ratio=0.6	Top-1 (%)	94.8	—	Unverified
5	AudioMAE (local)	Top-1 (%)	94.8	—	Unverified
6	ATST Base (ours)	Top-1 (%)	94.3	—	Unverified
7	AudioMAE (global)	Top-1 (%)	94.1	—	Unverified
8	AutoSpeech (N=8,C=128)	Top-1 (%)	87.66	—	Unverified
9	SSAST-FRAME	Top-1 (%)	80.8	—	Unverified
10	SSAMBA	Top-1 (%)	70.1	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Fuzzy Retrieval	Top-1 (%)	67.77	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Fuzzy Retrieval	Top-1 (%)	80.83	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Fuzzy Retrieval	Top-1 (%)	95.13	—	Unverified