| Few-Shot Multimodal Explanation for Visual Question Answering | Oct 28, 2024 | Explainable artificial intelligence, Explainable Artificial Intelligence (XAI) | Code Available | 0 |
| Music's Multimodal Complexity in AVQA: Why We Need More than General Multimodal LLMs | May 27, 2025 | Audio-visual Question Answering, Question Answering | Code Available | 0 |
| Zero-shot Commonsense Reasoning over Machine Imagination | Oct 12, 2024 | Question Answering, Visual Question Answering | Code Available | 0 |
| MUREL: Multimodal Relational Reasoning for Visual Question Answering | Feb 25, 2019 | Relational Reasoning, Visual Question Answering | Code Available | 0 |
| Multi-Sourced Compositional Generalization in Visual Question Answering | May 29, 2025 | Question Answering, Visual Question Answering | Code Available | 0 |
| Multiple interaction learning with question-type prior knowledge for constraining answer search space in visual question answering | Sep 23, 2020 | Question Answering, Visual Question Answering | Code Available | 0 |
| Object Attribute Matters in Visual Question Answering | Dec 20, 2023 | Attribute, Graph Neural Network | Code Available | 0 |
| Object-aware Adaptive-Positivity Learning for Audio-Visual Question Answering | Dec 20, 2023 | Audio-visual Question Answering, Audio-Visual Question Answering (AVQA) | Code Available | 0 |
| What is Right for Me is Not Yet Right for You: A Dataset for Grounding Relative Directions via Multi-Task Learning | May 5, 2022 | Multi-Task Learning, Question Answering | Code Available | 0 |
| Visual Robustness Benchmark for Visual Question Answering (VQA) | Jul 3, 2024 | Visual Question Answering, Visual Question Answering (VQA) | Code Available | 0 |