SOTAVerified

Speaker Recognition

Speaker recognition is the process of identifying a person (speaker identification) or confirming a claimed identity (speaker verification) from their speech segments.

Source: Margin Matters: Towards More Discriminative Deep Neural Network Embeddings for Speaker Recognition

Papers

Showing 1-25 of 435 papers

Title | Status | Hype
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit | Code | 6
VoxBlink2: A 100K+ Speaker Recognition Corpus and the Open-Set Speaker-Identification Benchmark | Code | 5
ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models | Code | 3
Take the aTrain. Introducing an Interface for the Accessible Transcription of Interviews | Code | 3
Leveraging In-the-Wild Data for Effective Self-Supervised Pretraining in Speaker Recognition | Code | 3
Pushing the limits of raw waveform speaker recognition | Code | 3
SEED: Speaker Embedding Enhancement Diffusion Model | Code | 2
Reshape Dimensions Network for Speaker Recognition | Code | 2
VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking | Code | 2
USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction | Code | 1
VoxSim: A perceptual voice similarity dataset | Code | 1
SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge? | Code | 1
VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge | Code | 1
Probabilistic Back-ends for Online Speaker Recognition and Clustering | Code | 1
TAPLoss: A Temporal Acoustic Parameter Loss for Speech Enhancement | Code | 1
OLKAVS: An Open Large-Scale Korean Audio-Visual Speech Dataset | Code | 1
Speaker recognition with two-step multi-modal deep cleansing | Code | 1
Toroidal Probabilistic Spherical Discriminant Analysis | Code | 1
Towards Understanding and Mitigating Audio Adversarial Examples for Speaker Recognition | Code | 1
Merkel Podcast Corpus: A Multimodal Dataset Compiled from 16 Years of Angela Merkel’s Weekly Video Podcasts | Code | 1
Speaker Recognition in the Wild | Code | 1
Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings | Code | 1
Training speaker recognition systems with limited data | Code | 1
Bias in Automated Speaker Recognition | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | w2v2-aam | EER | 1.88 | | Unverified
2 | WavLM+ECAPA-TDNN | EER | 0.39 | | Unverified
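
The metric in the table, EER (equal error rate), is the operating point at which the false acceptance rate equals the false rejection rate of a verification system. As a rough illustration of how such a number can be obtained from trial scores, here is a minimal Python sketch. The function name `compute_eer` and the toy scores are hypothetical, chosen only for illustration; the listing does not say how the listed papers computed their values or whether they are reported as percentages, and real evaluations typically use many more trials and interpolate between thresholds.

```python
def compute_eer(target_scores, nontarget_scores):
    """Approximate the equal error rate by sweeping observed scores as thresholds.

    target_scores: similarity scores for same-speaker trials (hypothetical input).
    nontarget_scores: similarity scores for different-speaker trials.
    A trial is accepted when its score is >= the threshold.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap = None
    eer = None
    for t in thresholds:
        # False rejection rate: same-speaker trials that fall below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance rate: different-speaker trials at or above the threshold.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest,
        # and report their average as the EER estimate.
        if best_gap is None or gap < best_gap:
            best_gap = gap
            eer = (far + frr) / 2
    return eer


# Toy example with made-up scores (not taken from any listed paper).
targets = [0.9, 0.8, 0.7, 0.4]
nontargets = [0.6, 0.3, 0.2, 0.1]
eer = compute_eer(targets, nontargets)
print(f"EER = {100 * eer:.2f}%")
```

A lower EER means fewer errors at the balanced operating point, which is why smaller values in the Claimed column indicate stronger systems, provided the results come from the same test set.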