| A Modulation-Domain Loss for Neural-Network-based Real-time Speech Enhancement | Feb 15, 2021 | Speaker Identification, Speech Denoising | Code Available | 1 | 5 |
| Disentangling Textual and Acoustic Features of Neural Speech Representations | Oct 3, 2024 | Disentanglement, Emotion Recognition | Code Available | 1 | 5 |
| MelHuBERT: A simplified HuBERT on Mel spectrograms | Nov 17, 2022 | Automatic Speech Recognition, Self-Supervised Learning | Code Available | 1 | 5 |
| Meta-Learning for Short Utterance Speaker Recognition with Imbalance Length Pairs | Apr 6, 2020 | Meta-Learning, Speaker Identification | Code Available | 1 | 5 |
| AutoSpeech: Neural Architecture Search for Speaker Recognition | May 7, 2020 | Image Classification | Code Available | 1 | 5 |
| Non-uniform Speaker Disentanglement For Depression Detection From Raw Speech Signals | Jun 2, 2023 | Depression Detection, Disentanglement | Code Available | 1 | 5 |
| Blind Speech Separation and Dereverberation using Neural Beamforming | Mar 24, 2021 | Speaker Identification, Speaker Separation | Code Available | 1 | 5 |
| End-to-End Chinese Speaker Identification | Jul 1, 2022 | Coreference Resolution | Code Available | 1 | 5 |
| MPCHAT: Towards Multimodal Persona-Grounded Conversation | May 27, 2023 | Speaker Identification | Code Available | 1 | 5 |
| FastAudio: A Learnable Audio Front-End for Spoof Speech Detection | Sep 6, 2021 | Speaker Identification, Speaker Verification | Code Available | 1 | 5 |
| Deep Discriminative Feature Learning for Accent Recognition | Nov 25, 2020 | Face Recognition, Speaker Identification | Code Available | 1 | 5 |
| ComiCap: A VLMs pipeline for dense captioning of Comic Panels | Sep 24, 2024 | Attribute, Dense Captioning | Code Available | 1 | 5 |
| CoMix: A Comprehensive Benchmark for Multi-Task Comic Understanding | Jul 4, 2024 | Dialogue Generation, Object Detection | Code Available | 1 | 5 |
| FoolHD: Fooling speaker identification by Highly imperceptible adversarial Disturbances | Nov 17, 2020 | Adversarial Attack, Speaker Identification | Code Available | 1 | 5 |
| GIFT: Graph-Induced Fine-Tuning for Multi-Party Conversation Understanding | May 16, 2023 | Speaker Identification | Code Available | 1 | 5 |
| Speech2Phone: A Novel and Efficient Method for Training Speaker Recognition Models | Feb 25, 2020 | Speaker Identification, Speaker Recognition | Code Available | 1 | 5 |
| SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing | Oct 14, 2021 | Automatic Speech Recognition (ASR) | Code Available | 1 | 5 |
| Improving speaker discrimination of target speech extraction with time-domain SpeakerBeam | Jan 23, 2020 | Speaker Identification, Speech Extraction | Code Available | 1 | 5 |
| Investigation of End-To-End Speaker-Attributed ASR for Continuous Multi-Talker Recordings | Aug 11, 2020 | Automatic Speech Recognition (ASR) | Code Available | 1 | 5 |
| An Effective Transformer-based Contextual Model and Temporal Gate Pooling for Speaker Identification | Aug 22, 2023 | Self-Supervised Learning, Speaker Identification | Code Available | 0 | 5 |
| Audio ALBERT: A Lite BERT for Self-supervised Learning of Audio Representation | May 18, 2020 | Self-Supervised Learning, Speaker Identification | Code Available | 0 | 5 |
| Attention-based multi-task learning for speech-enhancement and speaker-identification in multi-speaker dialogue scenario | Jan 7, 2021 | Multi-Task Learning, Speaker Identification | Code Available | 0 | 5 |
| Cross-Lingual Speaker Identification Using Distant Supervision | Oct 11, 2022 | Language Modeling | Code Available | 0 | 5 |
| Masked Modeling Duo: Learning Representations by Encouraging Both Networks to Model the Input | Oct 26, 2022 | Audio Classification, Audio Tagging | Code Available | 0 | 5 |
| Masked Modeling Duo: Towards a Universal Audio Pre-training Framework | Apr 9, 2024 | Audio Classification | Code Available | 0 | 5 |