Speaker Recognition

Speaker Recognition is the process of identifying or confirming a person's identity from segments of their speech.

Source: Margin Matters: Towards More Discriminative Deep Neural Network Embeddings for Speaker Recognition

Papers

Showing 1–50 of 435 papers

Title | Status | Hype
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit | Code | 6
VoxBlink2: A 100K+ Speaker Recognition Corpus and the Open-Set Speaker-Identification Benchmark | Code | 5
ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models | Code | 3
Take the aTrain. Introducing an Interface for the Accessible Transcription of Interviews | Code | 3
Leveraging In-the-Wild Data for Effective Self-Supervised Pretraining in Speaker Recognition | Code | 3
Pushing the limits of raw waveform speaker recognition | Code | 3
SEED: Speaker Embedding Enhancement Diffusion Model | Code | 2
Reshape Dimensions Network for Speaker Recognition | Code | 2
VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking | Code | 2
USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction | Code | 1
VoxSim: A perceptual voice similarity dataset | Code | 1
SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge? | Code | 1
VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge | Code | 1
Probabilistic Back-ends for Online Speaker Recognition and Clustering | Code | 1
TAPLoss: A Temporal Acoustic Parameter Loss for Speech Enhancement | Code | 1
OLKAVS: An Open Large-Scale Korean Audio-Visual Speech Dataset | Code | 1
Speaker recognition with two-step multi-modal deep cleansing | Code | 1
Toroidal Probabilistic Spherical Discriminant Analysis | Code | 1
Towards Understanding and Mitigating Audio Adversarial Examples for Speaker Recognition | Code | 1
Merkel Podcast Corpus: A Multimodal Dataset Compiled from 16 Years of Angela Merkel’s Weekly Video Podcasts | Code | 1
Merkel Podcast Corpus: A Multimodal Dataset Compiled from 16 Years of Angela Merkel's Weekly Video Podcasts | Code | 1
Speaker Recognition in the Wild | Code | 1
Training speaker recognition systems with limited data | Code | 1
Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings | Code | 1
Bias in Automated Speaker Recognition | Code | 1
HLT-NUS SUBMISSION FOR 2020 NIST Conversational Telephone Speech SRE | Code | 1
Self-supervised Speaker Recognition with Loss-gated Learning | Code | 1
Temporal Dynamic Convolutional Neural Network for Text-Independent Speaker Verification and Phonemetic Analysis | Code | 1
Fine-tuning wav2vec2 for speaker recognition | Code | 1
VoxCeleb Enrichment for Age and Gender Recognition | Code | 1
SpeechNAS: Towards Better Trade-off between Latency and Accuracy for Large-Scale Speaker Verification | Code | 1
SEC4SR: A Security Analysis Platform for Speaker Recognition | Code | 1
Exploring Deep Learning for Joint Audio-Visual Lip Biometrics | Code | 1
Speaker embeddings by modeling channel-wise correlations | Code | 1
EfficientTDNN: Efficient Architecture Search for Speaker Recognition | Code | 1
Deep Discriminative Feature Learning for Accent Recognition | Code | 1
Speaker anonymisation using the McAdams coefficient | Code | 1
Leveraging speaker attribute information using multi task learning for speaker verification and diarization | Code | 1
Unsupervised Representation Learning for Speaker Recognition via Contrastive Equilibrium Learning | Code | 1
Adversarial Attack and Defense Strategies for Deep Speaker Recognition Systems | Code | 1
Neural PLDA Modeling for End-to-End Speaker Verification | Code | 1
TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech | Code | 1
Crossed-Time Delay Neural Network for Speaker Recognition | Code | 1
AutoSpeech: Neural Architecture Search for Speaker Recognition | Code | 1
Universal Adversarial Perturbations Generative Network for Speaker Recognition | Code | 1
Meta-Learning for Short Utterance Speaker Recognition with Imbalance Length Pairs | Code | 1
AM-MobileNet1D: A Portable Model for Speaker Recognition | Code | 1
Speech2Phone: A Novel and Efficient Method for Training Speaker Recognition Models | Code | 1
NPLDA: A Deep Neural PLDA Model for Speaker Verification | Code | 1
BERTphone: Phonetically-Aware Encoder Representations for Utterance-Level Speaker and Language Recognition | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | w2v2-aam | EER | 1.88 | | Unverified
2 | WavLM+ECAPA-TDNN | EER | 0.39 | | Unverified