SOTAVerified

Speaker Identification

Papers

Showing 1-25 of 248 papers

Title | Status | Hype
CoLMbo: Speaker Language Model for Descriptive Profiling | Code | 0
Rhythm Features for Speaker Identification | - | 0
French Listening Tests for the Assessment of Intelligibility, Quality, and Identity of Body-Conducted Speech Enhancement | - | 0
Speech Unlearning | - | 0
Pretraining Multi-Speaker Identification for Neural Speaker Diarization | - | 0
REWIND: Speech Time Reversal for Enhancing Speaker Representations in Diffusion-based Voice Conversion | - | 0
HPP-Voice: A Large-Scale Evaluation of Speech Embeddings for Multi-Phenotypic Classification | - | 0
Quantized Approximate Signal Processing (QASP): Towards Homomorphic Encryption for audio | - | 0
From Dialect Gaps to Identity Maps: Tackling Variability in Speaker Verification | - | 0
Speaker Fuzzy Fingerprints: Benchmarking Text-Based Identification in Multiparty Dialogues | - | 0
Whisper Speaker Identification: Leveraging Pre-Trained Multilingual Transformers for Robust Speaker Embeddings | Code | 1
Speech-FT: Merging Pre-trained And Fine-Tuned Speech Representation Models For Cross-Task Generalization | - | 0
A Preliminary Exploration with GPT-4o Voice Mode | - | 0
SCDiar: a streaming diarization system based on speaker change detection and speech recognition | - | 0
Characteristic-Specific Partial Fine-Tuning for Efficient Emotion and Speaker Adaptation in Codec Language Text-to-Speech Models | - | 0
PolInterviews -- A Dataset of German Politician Public Broadcast Interviews | - | 0
Friends-MMC: A Dataset for Multi-modal Multi-party Conversation Understanding | Code | 0
Machine Unlearning reveals that the Gender-based Violence Victim Condition can be detected from Speech in a Speaker-Agnostic Setting | - | 0
Towards Speaker Identification with Minimal Dataset and Constrained Resources using 1D-Convolution Neural Network | Code | 0
Towards Advanced Speech Signal Processing: A Statistical Perspective on Convolution-Based Architectures and its Applications | - | 0
Incorporating Talker Identity Aids With Improving Speech Recognition in Adversarial Environments | - | 0
Disentangling Textual and Acoustic Features of Neural Speech Representations | Code | 1
Enhancing Open-Set Speaker Identification through Rapid Tuning with Speaker Reciprocal Points and Negative Sample | - | 0
ComiCap: A VLMs pipeline for dense captioning of Comic Panels | Code | 1
Exploring VQ-VAE with Prosody Parameters for Speaker Anonymization | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | MSM-MAE | Top-1 (%) | 96.6 | - | Unverified
2 | M2D/0.6 | Top-1 (%) | 96.5 | - | Unverified
3 | M2D/0.7 | Top-1 (%) | 96.3 | - | Unverified
4 | M2D ratio=0.6 | Top-1 (%) | 94.8 | - | Unverified
5 | AudioMAE (local) | Top-1 (%) | 94.8 | - | Unverified
6 | ATST Base (ours) | Top-1 (%) | 94.3 | - | Unverified
7 | AudioMAE (global) | Top-1 (%) | 94.1 | - | Unverified
8 | AutoSpeech (N=8,C=128) | Top-1 (%) | 87.66 | - | Unverified
9 | SSAST-FRAME | Top-1 (%) | 80.8 | - | Unverified
10 | SSAMBA | Top-1 (%) | 70.1 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Fuzzy Retrieval | Top-1 (%) | 67.77 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Fuzzy Retrieval | Top-1 (%) | 80.83 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Fuzzy Retrieval | Top-1 (%) | 95.13 | - | Unverified