SOTAVerified

Speaker Identification

Papers

Showing 51-75 of 248 papers

Title | Status | Hype
Attention-based multi-task learning for speech-enhancement and speaker-identification in multi-speaker dialogue scenario | Code | 0
Cross-Lingual Speaker Identification Using Distant Supervision | Code | 0
PF-Net: Personalized Filter for Speaker Recognition from Raw Waveform | Code | 0
Deep Speaker: an End-to-End Neural Speaker Embedding System | Code | 0
Contrastive Learning of General-Purpose Audio Representations | Code | 0
PL-EESR: Perceptual Loss Based END-TO-END Robust Speaker Representation Extraction | Code | 0
On Learning Associations of Faces and Voices | Code | 0
Compositional embedding models for speaker identification and diarization with simultaneous speech from 2+ speakers | Code | 0
An Effective Transformer-based Contextual Model and Temporal Gate Pooling for Speaker Identification | Code | 0
Towards Making the Most of Dialogue Characteristics for Neural Chat Translation | Code | 0
Compositional Clustering: Applications to Multi-Label Object Recognition and Speaker Identification | Code | 0
SIG: Speaker Identification in Literature via Prompt-Based Generation | Code | 0
CoLMbo: Speaker Language Model for Descriptive Profiling | Code | 0
A Generative Product-of-Filters Model of Audio | Code | 0
Latent space representation for multi-target speaker detection and identification with a sparse dataset using Triplet neural networks | Code | 0
Masked Modeling Duo: Learning Representations by Encouraging Both Networks to Model the Input | Code | 0
EVI: Multilingual Spoken Dialogue Tasks and Dataset for Knowledge-Based Enrolment, Verification, and Identification | Code | 0
Identifying Speakers in Dialogue Transcripts: A Text-based Approach Using Pretrained Language Models | Code | 0
Identify Speakers in Cocktail Parties with End-to-End Attention | Code | 0
Friends-MMC: A Dataset for Multi-modal Multi-party Conversation Understanding | Code | 0
Gammatonegram Representation for End-to-End Dysarthric Speech Processing Tasks: Speech Recognition, Speaker Identification, and Intelligibility Assessment | Code | 0
Just ASR + LLM? A Study on Speech Large Language Models' Ability to Identify and Understand Speaker in Spoken Dialogue | Code | 0
A domain-agnostic approach for opinion prediction on speech | Code | 0
Masked Modeling Duo: Towards a Universal Audio Pre-training Framework | Code | 0
End-to-End Diarization for Variable Number of Speakers with Local-Global Networks and Discriminative Speaker Embeddings | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MSM-MAE | Top-1 (%) | 96.6 | | Unverified
2 | M2D/0.6 | Top-1 (%) | 96.5 | | Unverified
3 | M2D/0.7 | Top-1 (%) | 96.3 | | Unverified
4 | M2D ratio=0.6 | Top-1 (%) | 94.8 | | Unverified
5 | AudioMAE (local) | Top-1 (%) | 94.8 | | Unverified
6 | ATST Base (ours) | Top-1 (%) | 94.3 | | Unverified
7 | AudioMAE (global) | Top-1 (%) | 94.1 | | Unverified
8 | AutoSpeech (N=8,C=128) | Top-1 (%) | 87.66 | | Unverified
9 | SSAST-FRAME | Top-1 (%) | 80.8 | | Unverified
10 | SSAMBA | Top-1 (%) | 70.1 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Fuzzy Retrieval | Top-1 (%) | 67.77 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Fuzzy Retrieval | Top-1 (%) | 80.83 | | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Fuzzy Retrieval | Top-1 (%) | 95.13 | | Unverified