Speaker Identification

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 51–100 of 248 papers

Title	Date	Tasks	Status	Hype
Privacy-preserving Representation Learning for Speech Understanding	Oct 26, 2023	ClassificationEmotion Recognition	—Unverified	0
Advanced accent/dialect identification and accentedness assessment with multi-embedding models and automatic speech recognition	Oct 17, 2023	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	—Unverified	0
End-to-end Multichannel Speaker-Attributed ASR: Speaker Guided Decoder and Input Feature Analysis	Oct 16, 2023	Automatic Speech RecognitionDecoder	—Unverified	0
InstructERC: Reforming Emotion Recognition in Conversation with Multi-task Retrieval-Augmented Large Language Models	Sep 21, 2023	Emotion RecognitionEmotion Recognition in Conversation	CodeCode Available	1
Test-Time Training for Speech	Sep 19, 2023	parameter-efficient fine-tuningSpeaker Identification	—Unverified	0
Spiking-LEAF: A Learnable Auditory front-end for Spiking Neural Networks	Sep 18, 2023	Keyword SpottingSpeaker Identification	—Unverified	0
Understanding Self-Supervised Learning of Speech Representation via Invariance and Redundancy Reduction	Sep 7, 2023	Keyword SpottingSelf-Supervised Learning	—Unverified	0
An Effective Transformer-based Contextual Model and Temporal Gate Pooling for Speaker Identification	Aug 22, 2023	Self-Supervised LearningSpeaker Identification	CodeCode Available	0
Gammatonegram Representation for End-to-End Dysarthric Speech Processing Tasks: Speech Recognition, Speaker Identification, and Intelligibility Assessment	Jul 6, 2023	Speaker Identificationspeech-recognition	CodeCode Available	0
Read, Look or Listen? What's Needed for Solving a Multimodal Dataset	Jul 6, 2023	Question AnsweringSpeaker Identification	—Unverified	0
VoxWatch: An open-set speaker recognition benchmark on VoxCeleb	Jun 30, 2023	Speaker IdentificationSpeaker Recognition	—Unverified	0
Non-uniform Speaker Disentanglement For Depression Detection From Raw Speech Signals	Jun 2, 2023	Depression DetectionDisentanglement	CodeCode Available	1
Meta-Learning Framework for End-to-End Imposter Identification in Unseen Speaker Recognition	Jun 1, 2023	Meta-LearningSpeaker Identification	—Unverified	0
Few-Shot Speaker Identification Using Lightweight Prototypical Network with Feature Grouping and Interaction	May 31, 2023	Speaker Identification	—Unverified	0
MPCHAT: Towards Multimodal Persona-Grounded Conversation	May 27, 2023	Speaker Identification	CodeCode Available	1
Ordered and Binary Speaker Embedding	May 25, 2023	ClusteringRetrieval	—Unverified	0
On the Transferability of Whisper-based Representations for "In-the-Wild" Cross-Task Downstream Speech Applications	May 23, 2023	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	—Unverified	0
GIFT: Graph-Induced Fine-Tuning for Multi-Party Conversation Understanding	May 16, 2023	Speaker Identification	CodeCode Available	1
Security and Privacy Problems in Voice Assistant Applications: A Survey	Apr 19, 2023	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	—Unverified	0
Unsupervised Speech Representation Pooling Using Vector Quantization	Apr 8, 2023	Emotion Recognitionintent-classification	CodeCode Available	0
HiSSNet: Sound Event Detection and Speaker Identification via Hierarchical Prototypical Networks for Low-Resource Headphones	Mar 13, 2023	Event DetectionSound Event Detection	—Unverified	0
Ensemble knowledge distillation of self-supervised speech models	Feb 24, 2023	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	—Unverified	0
ExARN: self-attending RNN for target speaker extraction	Dec 2, 2022	Speaker IdentificationTarget Speaker Extraction	—Unverified	0
ASiT: Local-Global Audio Spectrogram vIsion Transformer for Event Classification	Nov 23, 2022	Keyword SpottingSelf-Supervised Learning	CodeCode Available	1
MelHuBERT: A simplified HuBERT on Mel spectrograms	Nov 17, 2022	Automatic Speech RecognitionSelf-Supervised Learning	CodeCode Available	1
Multi-Label Training for Text-Independent Speaker Identification	Nov 14, 2022	Ensemble LearningSpeaker Identification	—Unverified	0
Privacy-Utility Balanced Voice De-Identification Using Adversarial Examples	Nov 10, 2022	De-identificationSpeaker Identification	—Unverified	0
Symmetric Saliency-based Adversarial Attack To Speaker Identification	Oct 30, 2022	Adversarial AttackDecoder	—Unverified	0
Masked Modeling Duo: Learning Representations by Encouraging Both Networks to Model the Input	Oct 26, 2022	Audio ClassificationAudio Tagging	—Unverified	0
Speaker Identification from emotional and noisy speech data using learned voice segregation and Speech VGG	Oct 23, 2022	Speaker Identification	—Unverified	0
Quantitative Evidence on Overlooked Aspects of Enrollment Speaker Embeddings for Target Speaker Separation	Oct 23, 2022	Speaker IdentificationSpeaker Separation	—Unverified	0
Cross-Lingual Speaker Identification Using Distant Supervision	Oct 11, 2022	Language ModelingLanguage Modelling	CodeCode Available	0
Text Independent Speaker Identification System for Access Control	Sep 26, 2022	Speaker Identification	—Unverified	0
Computing with Hypervectors for Efficient Speaker Identification	Aug 28, 2022	CPUQuantization	—Unverified	0
IndicSUPERB: A Speech Processing Universal Performance Benchmark for Indian languages	Aug 24, 2022	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	CodeCode Available	1
Deep versus Wide: An Analysis of Student Architectures for Task-Agnostic Knowledge Distillation of Self-Supervised Speech Models	Jul 14, 2022	Automatic Speech RecognitionAutomatic Speech Recognition (ASR)	—Unverified	0
Masked Autoencoders that Listen	Jul 13, 2022	Audio ClassificationDecoder	CodeCode Available	1
Graph-based Multi-View Fusion and Local Adaptation: Mitigating Within-Household Confusability for Speaker Identification	Jul 8, 2022	FairnessSpeaker Identification	—Unverified	0
End-to-End Chinese Speaker Identification	Jul 1, 2022	coreference-resolutionCoreference Resolution	CodeCode Available	1
Speaker Diarization and Identification from Single-Channel Classroom Audio Recording Using Virtual Microphones	Jul 1, 2022	speaker-diarizationSpeaker Diarization	—Unverified	0
iEmoTTS: Toward Robust Cross-Speaker Emotion Transfer and Control for Speech Synthesis based on Disentanglement between Prosody and Timbre	Jun 29, 2022	DisentanglementSpeaker Identification	—Unverified	0
Extended U-Net for Speaker Verification in Noisy Environments	Jun 27, 2022	DenoisingSpeaker Identification	CodeCode Available	1
Identifying Source Speakers for Voice Conversion based Spoofing Attacks on Speaker Verification Systems	Jun 18, 2022	Speaker IdentificationSpeaker Verification	—Unverified	0
Strategies to Improve Robustness of Target Speech Extraction to Enrollment Variations	Jun 16, 2022	Speaker IdentificationSpeech Extraction	—Unverified	0
Speaker Identification using Speech Recognition	May 29, 2022	Speaker Identificationspeech-recognition	—Unverified	0
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit	May 20, 2022	AllAutomatic Speech Recognition (ASR)	CodeCode Available	6
Silence is Sweeter Than Speech: Self-Supervised Model Using Silence to Store Speaker Information	May 8, 2022	Self-Supervised LearningSpeaker Identification	—Unverified	0
VFHQ: A High-Quality Dataset and Benchmark for Video Face Super-Resolution	May 6, 2022	BenchmarkingSpeaker Identification	—Unverified	0
EVI: Multilingual Spoken Dialogue Tasks and Dataset for Knowledge-Based Enrolment, Verification, and Identification	Apr 28, 2022	Speaker IdentificationSpeaker Verification	CodeCode Available	0
ATST: Audio Representation Learning with Teacher-Student Transformer	Apr 26, 2022	Audio ClassificationInstrument Recognition	CodeCode Available	1

Show:10 25 50

← PrevPage 2 of 5Next →

All datasets VoxCeleb1 EVI en-GB EVI fr-FR EVI pl-PL

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	MSM-MAE	Top-1 (%)	96.6	—	Unverified
2	M2D/0.6	Top-1 (%)	96.5	—	Unverified
3	M2D/0.7	Top-1 (%)	96.3	—	Unverified
4	M2D ratio=0.6	Top-1 (%)	94.8	—	Unverified
5	AudioMAE (local)	Top-1 (%)	94.8	—	Unverified
6	ATST Base (ours)	Top-1 (%)	94.3	—	Unverified
7	AudioMAE (global)	Top-1 (%)	94.1	—	Unverified
8	AutoSpeech (N=8,C=128)	Top-1 (%)	87.66	—	Unverified
9	SSAST-FRAME	Top-1 (%)	80.8	—	Unverified
10	SSAMBA	Top-1 (%)	70.1	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Fuzzy Retrieval	Top-1 (%)	67.77	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Fuzzy Retrieval	Top-1 (%)	80.83	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Fuzzy Retrieval	Top-1 (%)	95.13	—	Unverified