| PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit | May 20, 2022 | All, Automatic Speech Recognition (ASR) | Code Available | 6 |
| VoxBlink2: A 100K+ Speaker Recognition Corpus and the Open-Set Speaker-Identification Benchmark | Jul 16, 2024 | Diversity, Speaker Identification | Code Available | 5 |
| ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models | Jan 30, 2024 | Self-Supervised Learning, Speaker Recognition | Code Available | 3 |
| Take the aTrain. Introducing an Interface for the Accessible Transcription of Interviews | Oct 18, 2023 | CPU, GPU | Code Available | 3 |
| Leveraging In-the-Wild Data for Effective Self-Supervised Pretraining in Speaker Recognition | Sep 21, 2023 | Speaker Recognition | Code Available | 3 |
| Pushing the limits of raw waveform speaker recognition | Mar 16, 2022 | Self-Supervised Learning, Speaker Recognition | Code Available | 3 |
| SEED: Speaker Embedding Enhancement Diffusion Model | May 22, 2025 | model, Speaker Recognition | Code Available | 2 |
| Reshape Dimensions Network for Speaker Recognition | Jul 25, 2024 | Speaker Recognition | Code Available | 2 |
| VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking | Oct 11, 2018 | Speaker Recognition, Speaker Separation | Code Available | 2 |
| USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction | Sep 4, 2024 | Speaker Recognition, Speech Separation | Code Available | 1 |
| VoxSim: A perceptual voice similarity dataset | Jul 26, 2024 | Benchmarking, Speaker Recognition | Code Available | 1 |
| SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge? | Jun 14, 2023 | Natural Language Understanding, Self-Supervised Learning | Code Available | 1 |
| VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge | Feb 20, 2023 | Speaker Diarization, Speaker Recognition | Code Available | 1 |
| Probabilistic Back-ends for Online Speaker Recognition and Clustering | Feb 19, 2023 | Clustering, Online Clustering | Code Available | 1 |
| TAPLoss: A Temporal Acoustic Parameter Loss for Speech Enhancement | Feb 16, 2023 | Speaker Recognition, Speech Enhancement | Code Available | 1 |
| OLKAVS: An Open Large-Scale Korean Audio-Visual Speech Dataset | Jan 16, 2023 | Audio-Visual Speech Recognition, Lip Reading | Code Available | 1 |
| Speaker recognition with two-step multi-modal deep cleansing | Oct 28, 2022 | Representation Learning, Speaker Recognition | Code Available | 1 |
| Toroidal Probabilistic Spherical Discriminant Analysis | Oct 27, 2022 | Form, Speaker Recognition | Code Available | 1 |
| Towards Understanding and Mitigating Audio Adversarial Examples for Speaker Recognition | Jun 7, 2022 | Speaker Recognition, speech-recognition | Code Available | 1 |
| Merkel Podcast Corpus: A Multimodal Dataset Compiled from 16 Years of Angela Merkel’s Weekly Video Podcasts | Jun 1, 2022 | Face Detection, Face Generation | Code Available | 1 |
| Merkel Podcast Corpus: A Multimodal Dataset Compiled from 16 Years of Angela Merkel's Weekly Video Podcasts | May 24, 2022 | Face Detection, Face Generation | Code Available | 1 |
| Speaker Recognition in the Wild | May 5, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| Training speaker recognition systems with limited data | Mar 28, 2022 | Speaker Recognition | Code Available | 1 |
| Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings | Mar 28, 2022 | Speaker Recognition | Code Available | 1 |
| Bias in Automated Speaker Recognition | Jan 24, 2022 | BIG-bench Machine Learning, Face Recognition | Code Available | 1 |
| HLT-NUS SUBMISSION FOR 2020 NIST Conversational Telephone Speech SRE | Nov 12, 2021 | Domain Adaptation, Speaker Recognition | Code Available | 1 |
| Self-supervised Speaker Recognition with Loss-gated Learning | Oct 8, 2021 | Self-Supervised Learning, Speaker Recognition | Code Available | 1 |
| Temporal Dynamic Convolutional Neural Network for Text-Independent Speaker Verification and Phonemetic Analysis | Oct 7, 2021 | Speaker Recognition, Speaker Verification | Code Available | 1 |
| Fine-tuning wav2vec2 for speaker recognition | Sep 30, 2021 | Classification, Speaker Recognition | Code Available | 1 |
| VoxCeleb Enrichment for Age and Gender Recognition | Sep 28, 2021 | Age Estimation, regression | Code Available | 1 |
| SpeechNAS: Towards Better Trade-off between Latency and Accuracy for Large-Scale Speaker Verification | Sep 18, 2021 | Neural Architecture Search, Speaker Recognition | Code Available | 1 |
| SEC4SR: A Security Analysis Platform for Speaker Recognition | Sep 4, 2021 | Speaker Recognition | Code Available | 1 |
| Exploring Deep Learning for Joint Audio-Visual Lip Biometrics | Apr 17, 2021 | Deep Learning, Speaker Recognition | Code Available | 1 |
| Speaker embeddings by modeling channel-wise correlations | Apr 6, 2021 | Speaker Recognition, Style Transfer | Code Available | 1 |
| EfficientTDNN: Efficient Architecture Search for Speaker Recognition | Mar 25, 2021 | Data Augmentation, Network Pruning | Code Available | 1 |
| Deep Discriminative Feature Learning for Accent Recognition | Nov 25, 2020 | Face Recognition, Speaker Identification | Code Available | 1 |
| Speaker anonymisation using the McAdams coefficient | Nov 2, 2020 | Speaker Recognition | Code Available | 1 |
| Leveraging speaker attribute information using multi task learning for speaker verification and diarization | Oct 27, 2020 | Attribute, Multi-Task Learning | Code Available | 1 |
| Unsupervised Representation Learning for Speaker Recognition via Contrastive Equilibrium Learning | Oct 22, 2020 | Representation Learning, Speaker Recognition | Code Available | 1 |
| Adversarial Attack and Defense Strategies for Deep Speaker Recognition Systems | Aug 18, 2020 | Adversarial Attack, Adversarial Robustness | Code Available | 1 |
| Neural PLDA Modeling for End-to-End Speaker Verification | Aug 11, 2020 | Speaker Recognition, Speaker Verification | Code Available | 1 |
| TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech | Jul 12, 2020 | Keyword Spotting, Self-Supervised Learning | Code Available | 1 |
| Crossed-Time Delay Neural Network for Speaker Recognition | May 31, 2020 | Speaker Recognition, Speaker Verification | Code Available | 1 |
| AutoSpeech: Neural Architecture Search for Speaker Recognition | May 7, 2020 | image-classification, Image Classification | Code Available | 1 |
| Universal Adversarial Perturbations Generative Network for Speaker Recognition | Apr 7, 2020 | Speaker Recognition | Code Available | 1 |
| Meta-Learning for Short Utterance Speaker Recognition with Imbalance Length Pairs | Apr 6, 2020 | Meta-Learning, Speaker Identification | Code Available | 1 |
| AM-MobileNet1D: A Portable Model for Speaker Recognition | Mar 31, 2020 | Deep Learning, model | Code Available | 1 |
| Speech2Phone: A Novel and Efficient Method for Training Speaker Recognition Models | Feb 25, 2020 | Speaker Identification, Speaker Recognition | Code Available | 1 |
| NPLDA: A Deep Neural PLDA Model for Speaker Verification | Feb 10, 2020 | Speaker Recognition, Speaker Verification | Code Available | 1 |
| BERTphone: Phonetically-Aware Encoder Representations for Utterance-Level Speaker and Language Recognition | Jun 30, 2019 | Avg, Representation Learning | Code Available | 1 |