| PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit | May 20, 2022 | All, Automatic Speech Recognition (ASR) | Code Available | 6 | 5 |
| VoxBlink2: A 100K+ Speaker Recognition Corpus and the Open-Set Speaker-Identification Benchmark | Jul 16, 2024 | Diversity, Speaker Identification | Code Available | 5 | 5 |
| ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models | Jan 30, 2024 | Self-Supervised Learning, Speaker Recognition | Code Available | 3 | 5 |
| Take the aTrain. Introducing an Interface for the Accessible Transcription of Interviews | Oct 18, 2023 | CPU, GPU | Code Available | 3 | 5 |
| Leveraging In-the-Wild Data for Effective Self-Supervised Pretraining in Speaker Recognition | Sep 21, 2023 | Speaker Recognition | Code Available | 3 | 5 |
| Pushing the limits of raw waveform speaker recognition | Mar 16, 2022 | Self-Supervised Learning, Speaker Recognition | Code Available | 3 | 5 |
| VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking | Oct 11, 2018 | Speaker Recognition, Speaker Separation | Code Available | 2 | 5 |
| Reshape Dimensions Network for Speaker Recognition | Jul 25, 2024 | Speaker Recognition | Code Available | 2 | 5 |
| SEED: Speaker Embedding Enhancement Diffusion Model | May 22, 2025 | model, Speaker Recognition | Code Available | 2 | 5 |
| Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings | Mar 28, 2022 | Speaker Recognition | Code Available | 1 | 5 |
| NPLDA: A Deep Neural PLDA Model for Speaker Verification | Feb 10, 2020 | Speaker Recognition, Speaker Verification | Code Available | 1 | 5 |
| BERTphone: Phonetically-Aware Encoder Representations for Utterance-Level Speaker and Language Recognition | Jun 30, 2019 | Avg, Representation Learning | Code Available | 1 | 5 |
| Neural PLDA Modeling for End-to-End Speaker Verification | Aug 11, 2020 | Speaker Recognition, Speaker Verification | Code Available | 1 | 5 |
| Probabilistic Back-ends for Online Speaker Recognition and Clustering | Feb 19, 2023 | Clustering, Online Clustering | Code Available | 1 | 5 |
| OLKAVS: An Open Large-Scale Korean Audio-Visual Speech Dataset | Jan 16, 2023 | Audio-Visual Speech Recognition, Lip Reading | Code Available | 1 | 5 |
| Merkel Podcast Corpus: A Multimodal Dataset Compiled from 16 Years of Angela Merkel's Weekly Video Podcasts | May 24, 2022 | Face Detection, Face Generation | Code Available | 1 | 5 |
| Frame-level speaker embeddings for text-independent speaker recognition and analysis of end-to-end model | Sep 12, 2018 | Speaker Recognition, Text-Independent Speaker Recognition | Code Available | 1 | 5 |
| HLT-NUS SUBMISSION FOR 2020 NIST Conversational Telephone Speech SRE | Nov 12, 2021 | Domain Adaptation, Speaker Recognition | Code Available | 1 | 5 |
| Merkel Podcast Corpus: A Multimodal Dataset Compiled from 16 Years of Angela Merkel’s Weekly Video Podcasts | Jun 1, 2022 | Face Detection, Face Generation | Code Available | 1 | 5 |
| Bias in Automated Speaker Recognition | Jan 24, 2022 | BIG-bench Machine Learning, Face Recognition | Code Available | 1 | 5 |
| AutoSpeech: Neural Architecture Search for Speaker Recognition | May 7, 2020 | image-classification, Image Classification | Code Available | 1 | 5 |
| Leveraging speaker attribute information using multi task learning for speaker verification and diarization | Oct 27, 2020 | Attribute, Multi-Task Learning | Code Available | 1 | 5 |
| Adversarial Attack and Defense Strategies for Deep Speaker Recognition Systems | Aug 18, 2020 | Adversarial Attack, Adversarial Robustness | Code Available | 1 | 5 |
| Crossed-Time Delay Neural Network for Speaker Recognition | May 31, 2020 | Speaker Recognition, Speaker Verification | Code Available | 1 | 5 |
| Exploring Deep Learning for Joint Audio-Visual Lip Biometrics | Apr 17, 2021 | Deep Learning, Speaker Recognition | Code Available | 1 | 5 |