SOTAVerified

Speaker Recognition

Speaker Recognition is the process of identifying or confirming the identity of a person from their speech segments.

Source: Margin Matters: Towards More Discriminative Deep Neural Network Embeddings for Speaker Recognition

Papers

Showing 51–100 of 435 papers

Title | Status | Hype
Utterance-level Aggregation For Speaker Recognition In The Wild | Code | 1
Speech and Speaker Recognition from Raw Waveform with SincNet | Code | 1
Frame-level speaker embeddings for text-independent speaker recognition and analysis of end-to-end model | Code | 1
Speaker Recognition from Raw Waveform with SincNet | Code | 1
An Exploration of ECAPA-TDNN and x-vector Speaker Representations in Zero-shot Multi-speaker TTS | | 0
A Comparative Evaluation of Deep Learning Models for Speech Enhancement in Real-World Noisy Environments | | 0
CoLMbo: Speaker Language Model for Descriptive Profiling | Code | 0
Learning Speaker-Invariant Visual Features for Lipreading | | 0
Rhythm Features for Speaker Identification | | 0
Synthetic Speech Source Tracing using Metric Learning | | 0
Investigating the Reasonable Effectiveness of Speaker Pre-Trained Models and their Synergistic Power for SingMOS Prediction | | 0
LASPA: Language Agnostic Speaker Disentanglement with Prefix-Tuned Cross-Attention | | 0
Source Tracing of Synthetic Speech Systems Through Paralinguistic Pre-Trained Representations | | 0
Pretraining Multi-Speaker Identification for Neural Speaker Diarization | | 0
Private kNN-VC: Interpretable Anonymization of Converted Speech | Code | 0
Analysis of ABC Frontend Audio Systems for the NIST-SRE24 | | 0
SoCov: Semi-Orthogonal Parametric Pooling of Covariance Matrix for Speaker Recognition | | 0
From Dialect Gaps to Identity Maps: Tackling Variability in Speaker Verification | | 0
Audio-to-Image Encoding for Improved Voice Characteristic Detection Using Deep Convolutional Neural Networks | | 0
Language Modelling for Speaker Diarization in Telephonic Interviews | | 0
VoxVietnam: a Large-Scale Multi-Genre Dataset for Vietnamese Speaker Recognition | | 0
Investigating Prosodic Signatures via Speech Pre-Trained Models for Audio Deepfake Source Attribution | | 0
Study on Inter and Intra Speaker Variability in Speaker Recognition | | 0
Multi-View Multi-Task Modeling with Speech Foundation Models for Speech Forensic Tasks | | 0
Investigation of Speaker Representation for Target-Speaker Speech Processing | | 0
The OCON model: an old but green solution for distributable supervised classification for acoustic monitoring in smart cities | | 0
Enhancing Open-Set Speaker Identification through Rapid Tuning with Speaker Reciprocal Points and Negative Sample | | 0
Avengers Assemble: Amalgamation of Non-Semantic Features for Depression Detection | | 0
Are Music Foundation Models Better at Singing Voice Deepfake Detection? Far-Better Fuse them with Speech Foundation Models | | 0
Speaker-IPL: Unsupervised Learning of Speaker Characteristics with i-Vector based Pseudo-Labels | | 0
oboVox Far Field Speaker Recognition: A Novel Data Augmentation Approach with Pretrained Models | | 0
Text-To-Speech Synthesis In The Wild | | 0
Recursive Attentive Pooling for Extracting Speaker Embeddings from Multi-Speaker Recordings | | 0
The VoxCeleb Speaker Recognition Challenge: A Retrospective | | 0
Convexity-based Pruning of Speech Representation Models | | 0
Long-Term Conversation Analysis: Privacy-Utility Trade-off under Noise and Reverberation | | 0
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning | | 0
Team HYU ASML ROBOVOX SP Cup 2024 System Description | | 0
Phonetic Richness for Improved Automatic Speaker Verification | | 0
A voice and speech corpus of patients who underwent upper airway surgery in pre- and post-operative states | Code | 0
Analyzing Speech Unit Selection for Textless Speech-to-Speech Translation | | 0
We Need Variations in Speech Generation: Sub-center Modelling for Speaker Embeddings | | 0
Prosody-Driven Privacy-Preserving Dementia Detection | Code | 0
Open-Source Conversational AI with SpeechBrain 1.0 | | 0
CEC: A Noisy Label Detection Method for Speaker Recognition | | 0
Challenging margin-based speaker embedding extractors by using the variational information bottleneck | | 0
PERSONA: An Application for Emotion Recognition, Gender Recognition and Age Estimation | | 0
The Reasonable Effectiveness of Speaker Embeddings for Violence Detection | | 0
Fill in the Gap! Combining Self-supervised Representation Learning with Neural Audio Synthesis for Speech Inpainting | | 0
Speaker Characterization by means of Attention Pooling | | 0
Page 2 of 9

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | w2v2-aam | EER | 1.88 | | Unverified
2 | WavLM+ECAPA-TDNN | EER | 0.39 | | Unverified