| RGBT Tracking via All-layer Multimodal Interactions with Progressive Fusion Mamba | Aug 16, 2024 | All, Mamba | Unverified | 0 |
| Dissecting Dissonance: Benchmarking Large Multimodal Models Against Self-Contradictory Instructions | Aug 2, 2024 | Benchmarking, multimodal interaction | Code Available | 0 |
| A Unified Understanding of Adversarial Vulnerability Regarding Unimodal Models and Vision-Language Pre-training Models | Jul 25, 2024 | Data Augmentation, multimodal interaction | Unverified | 0 |
| Dallah: A Dialect-Aware Multimodal Large Language Model for Arabic | Jul 25, 2024 | Image to text, Language Modeling | Unverified | 0 |
| Empathic Grounding: Explorations using Multimodal Interaction and Large Language Models with Conversational Agents | Jul 1, 2024 | Emotional Intelligence, Emotion Classification | Code Available | 0 |
| HGNET: A Hierarchical Feature Guided Network for Occupancy Flow Field Prediction | Jul 1, 2024 | Autonomous Driving, multimodal interaction | Unverified | 0 |
| A look under the hood of the Interactive Deep Learning Enterprise (No-IDLE) | Jun 27, 2024 | Anatomy, Deep Learning | Unverified | 0 |
| OmniJARVIS: Unified Vision-Language-Action Tokenization Enables Open-World Instruction Following Agents | Jun 27, 2024 | Decoder, Imitation Learning | Unverified | 0 |
| EMMI -- Empathic Multimodal Motivational Interviews Dataset: Analyses and Annotations | Jun 24, 2024 | multimodal interaction | Unverified | 0 |
| Revisiting Multimodal Emotion Recognition in Conversation from the Perspective of Graph Spectrum | Apr 27, 2024 | Contrastive Learning, Emotion Recognition | Unverified | 0 |