| Compositional embedding models for speaker identification and diarization with simultaneous speech from 2+ speakers | Oct 22, 2020 | speaker-diarization, Speaker Diarization | Code Available | 0 |
| PL-EESR: Perceptual Loss Based END-TO-END Robust Speaker Representation Extraction | Oct 3, 2021 | Speaker Identification, Speaker Verification | Code Available | 0 |
| Compositional Clustering: Applications to Multi-Label Object Recognition and Speaker Identification | Sep 9, 2021 | Clustering, Few-Shot Learning | Code Available | 0 |
| Friends-MMC: A Dataset for Multi-modal Multi-party Conversation Understanding | Dec 23, 2024 | Speaker Identification | Code Available | 0 |
| An Effective Transformer-based Contextual Model and Temporal Gate Pooling for Speaker Identification | Aug 22, 2023 | Self-Supervised Learning, Speaker Identification | Code Available | 0 |
| Deep Speaker: an End-to-End Neural Speaker Embedding System | May 5, 2017 | Clustering, Speaker Identification | Code Available | 0 |
| A Generative Product-of-Filters Model of Audio | Dec 20, 2013 | model, Speaker Identification | Code Available | 0 |
| Unsupervised Speech Representation Pooling Using Vector Quantization | Apr 8, 2023 | Emotion Recognition, intent-classification | Code Available | 0 |