| Learning Audio-Visual Dereverberation | Jun 14, 2021 | Automatic Speech Recognition (ASR) | Code Available | 1 |
| Blind Speech Separation and Dereverberation using Neural Beamforming | Mar 24, 2021 | Speaker Identification, Speaker Separation | Code Available | 1 |
| Extended U-Net for Speaker Verification in Noisy Environments | Jun 27, 2022 | Denoising, Speaker Identification | Code Available | 1 |
| FastAudio: A Learnable Audio Front-End for Spoof Speech Detection | Sep 6, 2021 | Speaker Identification, Speaker Verification | Code Available | 1 |
| GIFT: Graph-Induced Fine-Tuning for Multi-Party Conversation Understanding | May 16, 2023 | Speaker Identification | Code Available | 1 |
| Improving speaker discrimination of target speech extraction with time-domain SpeakerBeam | Jan 23, 2020 | Speaker Identification, Speech Extraction | Code Available | 1 |
| ComiCap: A VLMs pipeline for dense captioning of Comic Panels | Sep 24, 2024 | Attribute, Dense Captioning | Code Available | 1 |
| CoMix: A Comprehensive Benchmark for Multi-Task Comic Understanding | Jul 4, 2024 | Dialogue Generation, object-detection | Code Available | 1 |
| Learning Speaker Representations with Mutual Information | Dec 1, 2018 | Sentence, Speaker Identification | Code Available | 1 |
| Non-uniform Speaker Disentanglement For Depression Detection From Raw Speech Signals | Jun 2, 2023 | Depression Detection, Disentanglement | Code Available | 1 |