| PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit | May 20, 2022 | All, Automatic Speech Recognition (ASR) | Code Available | 6 |
| VoxBlink2: A 100K+ Speaker Recognition Corpus and the Open-Set Speaker-Identification Benchmark | Jul 16, 2024 | Diversity, Speaker Identification | Code Available | 5 |
| ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models | Jan 30, 2024 | Self-Supervised Learning, Speaker Recognition | Code Available | 3 |
| Take the aTrain. Introducing an Interface for the Accessible Transcription of Interviews | Oct 18, 2023 | CPU, GPU | Code Available | 3 |
| Leveraging In-the-Wild Data for Effective Self-Supervised Pretraining in Speaker Recognition | Sep 21, 2023 | Speaker Recognition | Code Available | 3 |
| Pushing the limits of raw waveform speaker recognition | Mar 16, 2022 | Self-Supervised Learning, Speaker Recognition | Code Available | 3 |
| SEED: Speaker Embedding Enhancement Diffusion Model | May 22, 2025 | model, Speaker Recognition | Code Available | 2 |
| Reshape Dimensions Network for Speaker Recognition | Jul 25, 2024 | Speaker Recognition | Code Available | 2 |
| VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking | Oct 11, 2018 | Speaker Recognition, Speaker Separation | Code Available | 2 |
| USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction | Sep 4, 2024 | Speaker Recognition, Speech Separation | Code Available | 1 |
| VoxSim: A perceptual voice similarity dataset | Jul 26, 2024 | Benchmarking, Speaker Recognition | Code Available | 1 |
| SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge? | Jun 14, 2023 | Natural Language Understanding, Self-Supervised Learning | Code Available | 1 |
| VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge | Feb 20, 2023 | Speaker Diarization, Speaker Recognition | Code Available | 1 |
| Probabilistic Back-ends for Online Speaker Recognition and Clustering | Feb 19, 2023 | Clustering, Online Clustering | Code Available | 1 |
| TAPLoss: A Temporal Acoustic Parameter Loss for Speech Enhancement | Feb 16, 2023 | Speaker Recognition, Speech Enhancement | Code Available | 1 |
| OLKAVS: An Open Large-Scale Korean Audio-Visual Speech Dataset | Jan 16, 2023 | Audio-Visual Speech Recognition, Lip Reading | Code Available | 1 |
| Speaker recognition with two-step multi-modal deep cleansing | Oct 28, 2022 | Representation Learning, Speaker Recognition | Code Available | 1 |
| Toroidal Probabilistic Spherical Discriminant Analysis | Oct 27, 2022 | Form, Speaker Recognition | Code Available | 1 |
| Towards Understanding and Mitigating Audio Adversarial Examples for Speaker Recognition | Jun 7, 2022 | Speaker Recognition, speech-recognition | Code Available | 1 |
| Merkel Podcast Corpus: A Multimodal Dataset Compiled from 16 Years of Angela Merkel’s Weekly Video Podcasts | Jun 1, 2022 | Face Detection, Face Generation | Code Available | 1 |
| Merkel Podcast Corpus: A Multimodal Dataset Compiled from 16 Years of Angela Merkel's Weekly Video Podcasts | May 24, 2022 | Face Detection, Face Generation | Code Available | 1 |
| Speaker Recognition in the Wild | May 5, 2022 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Code Available | 1 |
| Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings | Mar 28, 2022 | Speaker Recognition | Code Available | 1 |
| Training speaker recognition systems with limited data | Mar 28, 2022 | Speaker Recognition | Code Available | 1 |
| Bias in Automated Speaker Recognition | Jan 24, 2022 | BIG-bench Machine Learning, Face Recognition | Code Available | 1 |