Speaker Identification

Papers

Showing 101–150 of 248 papers

Title	Date	Tasks	Status	Hype
Few-Shot Speaker Identification Using Depthwise Separable Convolutional Network with Channel Attention	Apr 24, 2022	Audio Classification, Few-Shot Learning	Unverified	0
WaBERT: A Low-resource End-to-end Model for Spoken Language Understanding and Speech-to-BERT Alignment	Apr 22, 2022	Automatic Speech Recognition (ASR)	Unverified	0
Listen only to me! How well can target speech extraction handle false alarms?	Apr 11, 2022	Speaker Identification, Speaker Verification	Unverified	0
AdvEst: Adversarial Perturbation Estimation to Classify and Detect Adversarial Attacks against Speaker Identification	Apr 8, 2022	Representation Learning, Speaker Identification	Unverified	0
Karaoker: Alignment-free singing voice synthesis with speech training data	Apr 8, 2022	Singing Voice Synthesis, Speaker Identification	Unverified	0
Improved Relation Networks for End-to-End Speaker Verification and Identification	Mar 31, 2022	Meta-Learning, Relation	Unverified	0
Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings	Mar 30, 2022	Automatic Speech Recognition (ASR)	Code Available	1
NeuraGen-A Low-Resource Neural Network based approach for Gender Classification	Mar 29, 2022	Gender Classification, Speaker Identification	Unverified	0
Speaker Identification Experiments Under Gender De-Identification	Mar 9, 2022	De-identification, Speaker Identification	Unverified	0
On the relevance of bandwidth extension for speaker identification	Feb 24, 2022	Bandwidth Extension, Speaker Identification	Unverified	0
openFEAT: Improving Speaker Identification by Open-set Few-shot Embedding Adaptation with Transformer	Feb 24, 2022	Open Set Learning, Speaker Identification	Unverified	0
Speech watermarking: an approach for the forensic analysis of digital telephonic recordings	Feb 23, 2022	Articles, Speaker Identification	Unverified	0
Tubes Among Us: Analog Attack on Automatic Speaker Identification	Feb 6, 2022	BIG-bench Machine Learning, Speaker Identification	Unverified	0
Cross-Lingual Speaker Identification from Weak Local Evidence	Jan 16, 2022	Language Modeling	Unverified	0
The exploitation of Multiple Feature Extraction Techniques for Speaker Identification in Emotional States under Disguised Voices	Dec 15, 2021	Speaker Identification, Voice Conversion	Unverified	0
SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech	Nov 19, 2021	Automatic Speech Recognition (ASR)	Code Available	1
Target Speech Extraction: Independent Vector Extraction Guided by Supervised Speaker Identification	Nov 5, 2021	Speaker Identification, Speech Extraction	Unverified	0
A Study of Acoustic Features in Arabic Speaker Identification under Noisy Environmental Conditions	Oct 23, 2021	Speaker Identification	Unverified	0
SSAST: Self-Supervised Audio Spectrogram Transformer	Oct 19, 2021	Audio Classification, Classification	Code Available	2
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing	Oct 14, 2021	Automatic Speech Recognition (ASR)	Code Available	1
UniSpeech-SAT: Universal Speech Representation Learning with Speaker Aware Pre-Training	Oct 12, 2021	Data Augmentation, Multi-Task Learning	Code Available	1
PEAF: Learnable Power Efficient Analog Acoustic Features for Audio Recognition	Oct 7, 2021	Action Detection, Activity Detection	Unverified	0
Transcribe-to-Diarize: Neural Speaker Diarization for Unlimited Number of Speakers using End-to-End Speaker-Attributed ASR	Oct 7, 2021	Action Detection, Activity Detection	Unverified	0
PL-EESR: Perceptual Loss Based END-TO-END Robust Speaker Representation Extraction	Oct 3, 2021	Speaker Identification, Speaker Verification	Code Available	0
Compositional Clustering: Applications to Multi-Label Object Recognition and Speaker Identification	Sep 9, 2021	Clustering, Few-Shot Learning	Code Available	0
Improving Speaker Identification for Shared Devices by Adapting Embeddings to Speaker Subsets	Sep 6, 2021	Speaker Identification	Unverified	0
FastAudio: A Learnable Audio Front-End for Spoof Speech Detection	Sep 6, 2021	Speaker Identification, Speaker Verification	Code Available	1
Towards Making the Most of Dialogue Characteristics for Neural Chat Translation	Sep 2, 2021	Machine Translation, Response Generation	Code Available	0
QASR: QCRI Aljazeera Speech Resource A Large Scale Annotated Arabic Speech Corpus	Aug 1, 2021	Automatic Speech Recognition (ASR)	Unverified	0
A Real-time Speaker Diarization System Based on Spatial Spectrum	Jul 20, 2021	Speaker Diarization	Unverified	0
Representation Learning to Classify and Detect Adversarial Attacks against Speaker and Speech Recognition Systems	Jul 9, 2021	Representation Learning, Speaker Identification	Unverified	0
QASR: QCRI Aljazeera Speech Resource -- A Large Scale Annotated Arabic Speech Corpus	Jun 24, 2021	Automatic Speech Recognition (ASR)	Unverified	0
Fusion of Embeddings Networks for Robust Combination of Text Dependent and Independent Speaker Recognition	Jun 18, 2021	Speaker Identification, Speaker Recognition	Unverified	0
Graph-based Label Propagation for Semi-Supervised Speaker Identification	Jun 15, 2021	Speaker Identification, Speaker Recognition	Unverified	0
Learning Audio-Visual Dereverberation	Jun 14, 2021	Automatic Speech Recognition (ASR)	Code Available	1
MPC-BERT: A Pre-Trained Language Model for Multi-Party Conversation Understanding	Jun 3, 2021	Conversational Response Selection, Language Modeling	Code Available	1
Supervised Speech Representation Learning for Parkinson's Disease Classification	Jun 1, 2021	Classification, Representation Learning	Code Available	1
PF-Net: Personalized Filter for Speaker Recognition from Raw Waveform	May 31, 2021	Speaker Identification, Speaker Recognition	Code Available	0
End-to-End Diarization for Variable Number of Speakers with Local-Global Networks and Discriminative Speaker Embeddings	May 5, 2021	Clustering, Speaker Identification	Unverified	0
End-to-End Speaker-Attributed ASR with Transformer	Apr 5, 2021	Automatic Speech Recognition (ASR)	Unverified	0
Streaming Multi-talker Speech Recognition with Joint Speaker Identification	Apr 5, 2021	Speaker Identification, Speech Recognition	Unverified	0
A Survey on Paralinguistics in Tamil Speech Processing	Apr 1, 2021	Emotion Recognition, Speaker Identification	Unverified	0
Speech Resynthesis from Discrete Disentangled Self-Supervised Representations	Apr 1, 2021	Disentanglement, Representation Learning	Code Available	1
Voice Privacy with Smart Digital Assistants in Educational Settings	Mar 24, 2021	Automatic Speech Recognition (ASR)	Unverified	0
Blind Speech Separation and Dereverberation using Neural Beamforming	Mar 24, 2021	Speaker Identification, Speaker Separation	Code Available	1
Triplet loss based embeddings for forensic speaker identification in Spanish	Feb 24, 2021	Speaker Identification, Triplet	Unverified	0
A Modulation-Domain Loss for Neural-Network-based Real-time Speech Enhancement	Feb 15, 2021	Speaker Identification, Speech Denoising	Code Available	1
CASA-Based Speaker Identification Using Cascaded GMM-CNN Classifier in Noisy and Emotional Talking Conditions	Feb 11, 2021	Emotion Recognition, Speaker Identification	Unverified	0
Speaker attribution with voice profiles by graph-based semi-supervised learning	Feb 6, 2021	Speaker Identification	Unverified	0
Attention-based multi-task learning for speech-enhancement and speaker-identification in multi-speaker dialogue scenario	Jan 7, 2021	Multi-Task Learning, Speaker Identification	Code Available	0

Datasets: VoxCeleb1, EVI en-GB, EVI fr-FR, EVI pl-PL

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	MSM-MAE	Top-1 (%)	96.6	—	Unverified
2	M2D/0.6	Top-1 (%)	96.5	—	Unverified
3	M2D/0.7	Top-1 (%)	96.3	—	Unverified
4	M2D ratio=0.6	Top-1 (%)	94.8	—	Unverified
5	AudioMAE (local)	Top-1 (%)	94.8	—	Unverified
6	ATST Base (ours)	Top-1 (%)	94.3	—	Unverified
7	AudioMAE (global)	Top-1 (%)	94.1	—	Unverified
8	AutoSpeech (N=8,C=128)	Top-1 (%)	87.66	—	Unverified
9	SSAST-FRAME	Top-1 (%)	80.8	—	Unverified
10	SSAMBA	Top-1 (%)	70.1	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Fuzzy Retrieval	Top-1 (%)	67.77	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Fuzzy Retrieval	Top-1 (%)	80.83	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Fuzzy Retrieval	Top-1 (%)	95.13	—	Unverified