| TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data | Jul 10, 2024 | Contrastive Learning, multimodal interaction | Code Available | 2 | 5 |
| Improving Multimodal Named Entity Recognition via Entity Span Detection with Unified Multimodal Transformer | Jul 1, 2020 | multimodal interaction, Multi-modal Named Entity Recognition | Code Available | 1 | 5 |
| Generative Multimodal Entity Linking | Jun 22, 2023 | Entity Linking, In-Context Learning | Code Available | 1 | 5 |
| A Facial Expression-Aware Multimodal Multi-task Learning Framework for Emotion Recognition in Multi-party Conversations | Jul 1, 2023 | Emotion Recognition, Emotion Recognition in Conversation | Code Available | 1 | 5 |
| Investigating and Mitigating the Multimodal Hallucination Snowballing in Large Vision-Language Models | Jun 30, 2024 | Hallucination, multimodal interaction | Code Available | 1 | 5 |
| MMoE: Enhancing Multimodal Models with Mixtures of Multimodal Interaction Experts | Nov 16, 2023 | Binary Classification, Descriptive | Code Available | 1 | 5 |
| Mamba-Enhanced Text-Audio-Video Alignment Network for Emotion Recognition in Conversations | Sep 8, 2024 | Emotion Recognition, Mamba | Code Available | 1 | 5 |
| Dynamic Modality Interaction Modeling for Image-Text Retrieval | Jul 11, 2021 | cross-modal alignment, Cross-Modal Retrieval | Code Available | 1 | 5 |
| Cooperative Sentiment Agents for Multimodal Sentiment Analysis | Apr 19, 2024 | Disentanglement, Emotion Recognition | Code Available | 1 | 5 |
| Dialogue-based generation of self-driving simulation scenarios using Large Language Models | Oct 26, 2023 | multimodal interaction, Self-Driving Cars | Code Available | 1 | 5 |