SOTAVerified

Speaker Identification

Papers

Showing 51-75 of 248 papers

Title | Status | Hype
Privacy-preserving Representation Learning for Speech Understanding | - | 0
Advanced accent/dialect identification and accentedness assessment with multi-embedding models and automatic speech recognition | - | 0
End-to-end Multichannel Speaker-Attributed ASR: Speaker Guided Decoder and Input Feature Analysis | - | 0
InstructERC: Reforming Emotion Recognition in Conversation with Multi-task Retrieval-Augmented Large Language Models | Code | 1
Test-Time Training for Speech | - | 0
Spiking-LEAF: A Learnable Auditory front-end for Spiking Neural Networks | - | 0
Understanding Self-Supervised Learning of Speech Representation via Invariance and Redundancy Reduction | - | 0
An Effective Transformer-based Contextual Model and Temporal Gate Pooling for Speaker Identification | Code | 0
Read, Look or Listen? What's Needed for Solving a Multimodal Dataset | - | 0
Gammatonegram Representation for End-to-End Dysarthric Speech Processing Tasks: Speech Recognition, Speaker Identification, and Intelligibility Assessment | Code | 0
VoxWatch: An open-set speaker recognition benchmark on VoxCeleb | - | 0
Non-uniform Speaker Disentanglement For Depression Detection From Raw Speech Signals | Code | 1
Meta-Learning Framework for End-to-End Imposter Identification in Unseen Speaker Recognition | - | 0
Few-Shot Speaker Identification Using Lightweight Prototypical Network with Feature Grouping and Interaction | - | 0
MPCHAT: Towards Multimodal Persona-Grounded Conversation | Code | 1
Ordered and Binary Speaker Embedding | - | 0
On the Transferability of Whisper-based Representations for "In-the-Wild" Cross-Task Downstream Speech Applications | - | 0
GIFT: Graph-Induced Fine-Tuning for Multi-Party Conversation Understanding | Code | 1
Security and Privacy Problems in Voice Assistant Applications: A Survey | - | 0
Unsupervised Speech Representation Pooling Using Vector Quantization | Code | 0
HiSSNet: Sound Event Detection and Speaker Identification via Hierarchical Prototypical Networks for Low-Resource Headphones | - | 0
Ensemble knowledge distillation of self-supervised speech models | - | 0
ExARN: self-attending RNN for target speaker extraction | - | 0
ASiT: Local-Global Audio Spectrogram vIsion Transformer for Event Classification | Code | 1
MelHuBERT: A simplified HuBERT on Mel spectrograms | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MSM-MAE | Top-1 (%) | 96.6 | - | Unverified
2 | M2D/0.6 | Top-1 (%) | 96.5 | - | Unverified
3 | M2D/0.7 | Top-1 (%) | 96.3 | - | Unverified
4 | M2D ratio=0.6 | Top-1 (%) | 94.8 | - | Unverified
5 | AudioMAE (local) | Top-1 (%) | 94.8 | - | Unverified
6 | ATST Base (ours) | Top-1 (%) | 94.3 | - | Unverified
7 | AudioMAE (global) | Top-1 (%) | 94.1 | - | Unverified
8 | AutoSpeech (N=8,C=128) | Top-1 (%) | 87.66 | - | Unverified
9 | SSAST-FRAME | Top-1 (%) | 80.8 | - | Unverified
10 | SSAMBA | Top-1 (%) | 70.1 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Fuzzy Retrieval | Top-1 (%) | 67.77 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Fuzzy Retrieval | Top-1 (%) | 80.83 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Fuzzy Retrieval | Top-1 (%) | 95.13 | - | Unverified