| GIFT: Graph-Induced Fine-Tuning for Multi-Party Conversation Understanding | May 16, 2023 | Speaker Identification | Code Available | 1 |
| Improving speaker discrimination of target speech extraction with time-domain SpeakerBeam | Jan 23, 2020 | Speaker Identification, Speech Extraction | Code Available | 1 |
| AM-MobileNet1D: A Portable Model for Speaker Recognition | Mar 31, 2020 | Deep Learning, model | Code Available | 1 |
| Learning Audio-Visual Dereverberation | Jun 14, 2021 | Automatic Speech Recognition (ASR) | Code Available | 1 |
| ATST: Audio Representation Learning with Teacher-Student Transformer | Apr 26, 2022 | Audio Classification, Instrument Recognition | Code Available | 1 |
| A Modulation-Domain Loss for Neural-Network-based Real-time Speech Enhancement | Feb 15, 2021 | Speaker Identification, Speech Denoising | Code Available | 1 |
| ComiCap: A VLMs pipeline for dense captioning of Comic Panels | Sep 24, 2024 | Attribute, Dense Captioning | Code Available | 1 |
| Blind Speech Separation and Dereverberation using Neural Beamforming | Mar 24, 2021 | Speaker Identification, Speaker Separation | Code Available | 1 |
| Minimum Bayes Risk Training for End-to-End Speaker-Attributed ASR | Nov 3, 2020 | Automatic Speech Recognition (ASR) | Code Available | 1 |
| AutoSpeech: Neural Architecture Search for Speaker Recognition | May 7, 2020 | Image Classification | Code Available | 1 |