SOTAVerified

Speaker Recognition

Speaker Recognition is the process of identifying or confirming a person's identity from their speech segments.

Source: Margin Matters: Towards More Discriminative Deep Neural Network Embeddings for Speaker Recognition

Papers

Showing 1-25 of 435 papers

Title | Status | Hype
An Exploration of ECAPA-TDNN and x-vector Speaker Representations in Zero-shot Multi-speaker TTS | | 0
A Comparative Evaluation of Deep Learning Models for Speech Enhancement in Real-World Noisy Environments | | 0
CoLMbo: Speaker Language Model for Descriptive Profiling | Code | 0
Learning Speaker-Invariant Visual Features for Lipreading | | 0
Rhythm Features for Speaker Identification | | 0
Synthetic Speech Source Tracing using Metric Learning | | 0
LASPA: Language Agnostic Speaker Disentanglement with Prefix-Tuned Cross-Attention | | 0
Investigating the Reasonable Effectiveness of Speaker Pre-Trained Models and their Synergistic Power for SingMOS Prediction | | 0
Source Tracing of Synthetic Speech Systems Through Paralinguistic Pre-Trained Representations | | 0
Pretraining Multi-Speaker Identification for Neural Speaker Diarization | | 0
Private kNN-VC: Interpretable Anonymization of Converted Speech | Code | 0
SEED: Speaker Embedding Enhancement Diffusion Model | Code | 2
Analysis of ABC Frontend Audio Systems for the NIST-SRE24 | | 0
SoCov: Semi-Orthogonal Parametric Pooling of Covariance Matrix for Speaker Recognition | | 0
From Dialect Gaps to Identity Maps: Tackling Variability in Speaker Verification | | 0
Audio-to-Image Encoding for Improved Voice Characteristic Detection Using Deep Convolutional Neural Networks | | 0
Language Modelling for Speaker Diarization in Telephonic Interviews | | 0
VoxVietnam: a Large-Scale Multi-Genre Dataset for Vietnamese Speaker Recognition | | 0
Investigating Prosodic Signatures via Speech Pre-Trained Models for Audio Deepfake Source Attribution | | 0
Study on Inter and Intra Speaker Variability in Speaker Recognition | | 0
Multi-View Multi-Task Modeling with Speech Foundation Models for Speech Forensic Tasks | | 0
Investigation of Speaker Representation for Target-Speaker Speech Processing | | 0
The OCON model: an old but green solution for distributable supervised classification for acoustic monitoring in smart cities | | 0
Enhancing Open-Set Speaker Identification through Rapid Tuning with Speaker Reciprocal Points and Negative Sample | | 0
Avengers Assemble: Amalgamation of Non-Semantic Features for Depression Detection | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | w2v2-aam | EER | 1.88 | | Unverified
2 | WavLM+ECAPA-TDNN | EER | 0.39 | | Unverified