| Learning Speaker Representations with Mutual Information | Dec 1, 2018 | Sentence, Speaker Identification | Code Available | 1 | 5 |
| Minimum Bayes Risk Training for End-to-End Speaker-Attributed ASR | Nov 3, 2020 | Automatic Speech Recognition (ASR) | Code Available | 1 | 5 |
| Masked Autoencoders that Listen | Jul 13, 2022 | Audio Classification, Decoder | Code Available | 1 | 5 |
| MelHuBERT: A simplified HuBERT on Mel spectrograms | Nov 17, 2022 | Automatic Speech Recognition, Self-Supervised Learning | Code Available | 1 | 5 |
| End-to-End Chinese Speaker Identification | Jul 1, 2022 | Coreference Resolution | Code Available | 1 | 5 |
| MPC-BERT: A Pre-Trained Language Model for Multi-Party Conversation Understanding | Jun 3, 2021 | Conversational Response Selection, Language Modeling | Code Available | 1 | 5 |
| ComiCap: A VLMs pipeline for dense captioning of Comic Panels | Sep 24, 2024 | Attribute, Dense Captioning | Code Available | 1 | 5 |
| CoMix: A Comprehensive Benchmark for Multi-Task Comic Understanding | Jul 4, 2024 | Dialogue Generation, Object Detection | Code Available | 1 | 5 |
| Extended U-Net for Speaker Verification in Noisy Environments | Jun 27, 2022 | Denoising, Speaker Identification | Code Available | 1 | 5 |
| Speaker Recognition from Raw Waveform with SincNet | Jul 29, 2018 | Speaker Identification, Speaker Recognition | Code Available | 1 | 5 |