SOTAVerified

Speaker Identification

Papers

Showing 51–100 of 248 papers

| Title | Status | Hype |
| --- | --- | --- |
| Towards Making the Most of Dialogue Characteristics for Neural Chat Translation | Code | 0 |
| Unsupervised Speech Representation Pooling Using Vector Quantization | Code | 0 |
| Contrastive Learning of General-Purpose Audio Representations | Code | 0 |
| Deep Speaker: an End-to-End Neural Speaker Embedding System | Code | 0 |
| Towards Speaker Identification with Minimal Dataset and Constrained Resources using 1D-Convolution Neural Network | Code | 0 |
| Cross-Lingual Speaker Identification Using Distant Supervision | Code | 0 |
| Compositional embedding models for speaker identification and diarization with simultaneous speech from 2+ speakers | Code | 0 |
| Compositional Clustering: Applications to Multi-Label Object Recognition and Speaker Identification | Code | 0 |
| Friends-MMC: A Dataset for Multi-modal Multi-party Conversation Understanding | Code | 0 |
| PF-Net: Personalized Filter for Speaker Recognition from Raw Waveform | Code | 0 |
| SIG: Speaker Identification in Literature via Prompt-Based Generation | Code | 0 |
| Word-level Embeddings for Cross-Task Transfer Learning in Speech Processing | Code | 0 |
| On Learning Associations of Faces and Voices | Code | 0 |
| CoLMbo: Speaker Language Model for Descriptive Profiling | Code | 0 |
| Masked Modeling Duo: Towards a Universal Audio Pre-training Framework | Code | 0 |
| A Generative Product-of-Filters Model of Audio | Code | 0 |
| EVI: Multilingual Spoken Dialogue Tasks and Dataset for Knowledge-Based Enrolment, Verification, and Identification | Code | 0 |
| Just ASR + LLM? A Study on Speech Large Language Models' Ability to Identify and Understand Speaker in Spoken Dialogue | Code | 0 |
| Latent space representation for multi-target speaker detection and identification with a sparse dataset using Triplet neural networks | Code | 0 |
| Identifying Speakers in Dialogue Transcripts: A Text-based Approach Using Pretrained Language Models | Code | 0 |
| A domain-agnostic approach for opinion prediction on speech | Code | 0 |
| Identify Speakers in Cocktail Parties with End-to-End Attention | Code | 0 |
| Masked Modeling Duo: Learning Representations by Encouraging Both Networks to Model the Input | Code | 0 |
| PL-EESR: Perceptual Loss Based END-TO-END Robust Speaker Representation Extraction | Code | 0 |
| End-to-End Diarization for Variable Number of Speakers with Local-Global Networks and Discriminative Speaker Embeddings | | 0 |
| Emirati-Accented Speaker Identification in Stressful Talking Conditions | | 0 |
| Efficiency-oriented approaches for self-supervised speech representation learning | | 0 |
| A user study to compare two conversational assistants designed for people with hearing impairments | | 0 |
| Effect of utterance duration and phonetic content on speaker identification using second-order statistical methods | | 0 |
| Discrimination between Similar Languages, Varieties and Dialects using CNN- and LSTM-based Deep Neural Networks | | 0 |
| A Multi Level Data Fusion Approach for Speaker Identification on Telephone Speech | | 0 |
| Advances in Online Audio-Visual Meeting Transcription | | 0 |
| Deep versus Wide: An Analysis of Student Architectures for Task-Agnostic Knowledge Distillation of Self-Supervised Speech Models | | 0 |
| Deep Neural Networks for Automatic Speech Processing: A Survey from Large Corpora to Limited Data | | 0 |
| DASB -- Discrete Audio and Speech Benchmark | | 0 |
| Curie: A method for protecting SVM Classifier from Poisoning Attack | | 0 |
| A Toolkit for Joint Speaker Diarization and Identification with Application to Speaker-Attributed ASR | | 0 |
| Hypothesis Stitcher for End-to-End Speaker-attributed ASR on Long-form Multi-talker Recordings | | 0 |
| HPP-Voice: A Large-Scale Evaluation of Speech Embeddings for Multi-Phenotypic Classification | | 0 |
| Cross-Lingual Speaker Identification from Weak Local Evidence | | 0 |
| A Survey on Paralinguistics in Tamil Speech Processing | | 0 |
| Advanced Rich Transcription System for Estonian Speech | | 0 |
| Adaptive blind audio source extraction supervised by dominant speaker identification using x-vectors | | 0 |
| How Redundant Is the Transformer Stack in Speech Representation Models? | | 0 |
| How Far Are We from Robust Voice Conversion: A Survey | | 0 |
| H-VECTORS: Utterance-level Speaker Embedding Using A Hierarchical Attention Model | | 0 |
| Cosine similarity-based adversarial process | | 0 |
| Identification of Speakers in Novels | | 0 |
| Identifying Source Speakers for Voice Conversion based Spoofing Attacks on Speaker Verification Systems | | 0 |
| Histogram Transform-based Speaker Identification | | 0 |

Benchmark Results

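| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | MSM-MAE | Top-1 (%) | 96.6 | | Unverified |
| 2 | M2D/0.6 | Top-1 (%) | 96.5 | | Unverified |
| 3 | M2D/0.7 | Top-1 (%) | 96.3 | | Unverified |
| 4 | M2D ratio=0.6 | Top-1 (%) | 94.8 | | Unverified |
| 5 | AudioMAE (local) | Top-1 (%) | 94.8 | | Unverified |
| 6 | ATST Base (ours) | Top-1 (%) | 94.3 | | Unverified |
| 7 | AudioMAE (global) | Top-1 (%) | 94.1 | | Unverified |
| 8 | AutoSpeech (N=8,C=128) | Top-1 (%) | 87.66 | | Unverified |
| 9 | SSAST-FRAME | Top-1 (%) | 80.8 | | Unverified |
| 10 | SSAMBA | Top-1 (%) | 70.1 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Fuzzy Retrieval | Top-1 (%) | 67.77 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Fuzzy Retrieval | Top-1 (%) | 80.83 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Fuzzy Retrieval | Top-1 (%) | 95.13 | | Unverified |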