SOTAVerified

Speaker Identification

Papers

Showing 201–248 of 248 papers

Title | Status | Hype
Speaker and Posture Classification using Instantaneous Intraspeech Breathing Features |  | 0
Speaker attribution with voice profiles by graph-based semi-supervised learning |  | 0
Speaker Diarization and Identification from Single-Channel Classroom Audio Recording Using Virtual Microphones |  | 0
Speaker Fuzzy Fingerprints: Benchmarking Text-Based Identification in Multiparty Dialogues |  | 0
Speaker Identification Experiments Under Gender De-Identification |  | 0
Speaker Identification from emotional and noisy speech data using learned voice segregation and Speech VGG |  | 0
Speaker identification from the sound of the human breath |  | 0
Speaker Identification From Youtube Obtained Data |  | 0
Speaker Identification in each of the Neutral and Shouted Talking Environments based on Gender-Dependent Approach Using SPHMMs |  | 0
Speaker Identification using EEG |  | 0
Speaker Identification using Speech Recognition |  | 0
Speaker Recognition in Bengali Language from Nonlinear Features |  | 0
Meta-Learning Framework for End-to-End Imposter Identification in Unseen Speaker Recognition |  | 0
Speech Enhancement using Self-Adaptation and Multi-Head Self-Attention |  | 0
Speech-FT: Merging Pre-trained And Fine-Tuned Speech Representation Models For Cross-Task Generalization |  | 0
Speech Rhythm-Based Speaker Embeddings Extraction from Phonemes and Phoneme Duration for Multi-Speaker Speech Synthesis |  | 0
Speech Unlearning |  | 0
Speech watermarking: an approach for the forensic analysis of digital telephonic recordings |  | 0
Masked Modeling Duo: Towards a Universal Audio Pre-training Framework | Code | 0
Unsupervised Speech Representation Pooling Using Vector Quantization | Code | 0
Masked Modeling Duo: Learning Representations by Encouraging Both Networks to Model the Input | Code | 0
Latent space representation for multi-target speaker detection and identification with a sparse dataset using Triplet neural networks | Code | 0
Just ASR + LLM? A Study on Speech Large Language Models' Ability to Identify and Understand Speaker in Spoken Dialogue | Code | 0
SIG: Speaker Identification in Literature via Prompt-Based Generation | Code | 0
Deep Learning for Speaker Identification: Architectural Insights from AB-1 Corpus Analysis and Performance Evaluation | Code | 0
Identify Speakers in Cocktail Parties with End-to-End Attention | Code | 0
Identifying Speakers in Dialogue Transcripts: A Text-based Approach Using Pretrained Language Models | Code | 0
Delving into VoxCeleb: environment invariant speaker recognition | Code | 0
CoLMbo: Speaker Language Model for Descriptive Profiling | Code | 0
Audio ALBERT: A Lite BERT for Self-supervised Learning of Audio Representation | Code | 0
Cross-Lingual Speaker Identification Using Distant Supervision | Code | 0
A domain-agnostic approach for opinion prediction on speech | Code | 0
Contrastive Learning of General-Purpose Audio Representations | Code | 0
Gammatonegram Representation for End-to-End Dysarthric Speech Processing Tasks: Speech Recognition, Speaker Identification, and Intelligibility Assessment | Code | 0
On Learning Associations of Faces and Voices | Code | 0
Towards Making the Most of Dialogue Characteristics for Neural Chat Translation | Code | 0
Word-level Embeddings for Cross-Task Transfer Learning in Speech Processing | Code | 0
PF-Net: Personalized Filter for Speaker Recognition from Raw Waveform | Code | 0
Attention-based multi-task learning for speech-enhancement and speaker-identification in multi-speaker dialogue scenario | Code | 0
Towards Speaker Identification with Minimal Dataset and Constrained Resources using 1D-Convolution Neural Network | Code | 0
Friends-MMC: A Dataset for Multi-modal Multi-party Conversation Understanding | Code | 0
Compositional embedding models for speaker identification and diarization with simultaneous speech from 2+ speakers | Code | 0
PL-EESR: Perceptual Loss Based END-TO-END Robust Speaker Representation Extraction | Code | 0
Compositional Clustering: Applications to Multi-Label Object Recognition and Speaker Identification | Code | 0
EVI: Multilingual Spoken Dialogue Tasks and Dataset for Knowledge-Based Enrolment, Verification, and Identification | Code | 0
An Effective Transformer-based Contextual Model and Temporal Gate Pooling for Speaker Identification | Code | 0
Deep Speaker: an End-to-End Neural Speaker Embedding System | Code | 0
A Generative Product-of-Filters Model of Audio | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MSM-MAE | Top-1 (%) | 96.6 |  | Unverified
2 | M2D/0.6 | Top-1 (%) | 96.5 |  | Unverified
3 | M2D/0.7 | Top-1 (%) | 96.3 |  | Unverified
4 | M2D ratio=0.6 | Top-1 (%) | 94.8 |  | Unverified
5 | AudioMAE (local) | Top-1 (%) | 94.8 |  | Unverified
6 | ATST Base (ours) | Top-1 (%) | 94.3 |  | Unverified
7 | AudioMAE (global) | Top-1 (%) | 94.1 |  | Unverified
8 | AutoSpeech (N=8,C=128) | Top-1 (%) | 87.66 |  | Unverified
9 | SSAST-FRAME | Top-1 (%) | 80.8 |  | Unverified
10 | SSAMBA | Top-1 (%) | 70.1 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Fuzzy Retrieval | Top-1 (%) | 67.77 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Fuzzy Retrieval | Top-1 (%) | 80.83 |  | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Fuzzy Retrieval | Top-1 (%) | 95.13 |  | Unverified