| When are Lemons Purple? The Concept Association Bias of Vision-Language Models | Dec 22, 2022 | Attribute, image-classification | —Unverified | 0 | 0 |
| Accuracy vs. Complexity: A Trade-off in Visual Question Answering Models | Jan 20, 2020 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Exploring the Frontier of Vision-Language Models: A Survey of Current Methodologies and Future Directions | Feb 20, 2024 | Image Captioning, Question Answering | —Unverified | 0 | 0 |
| An Empirical Study on Leveraging Scene Graphs for Visual Question Answering | Jul 28, 2019 | Knowledge Graphs, Question Answering | —Unverified | 0 | 0 |
| LRRA:A Transparent Neural-Symbolic Reasoning Framework for Real-World Visual Question Answering | Aug 1, 2021 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Exploring the Effectiveness of Object-Centric Representations in Visual Question Answering: Comparative Insights with Foundation Models | Jul 22, 2024 | Question Answering, Representation Learning | —Unverified | 0 | 0 |
| Can You Explain That? Lucid Explanations Help Human-AI Collaborative Image Retrieval | Apr 5, 2019 | Image Retrieval, Question Answering | —Unverified | 0 | 0 |
| LVLM_CSP: Accelerating Large Vision Language Models via Clustering, Scattering, and Pruning for Reasoning Segmentation | Apr 15, 2025 | Image Captioning, Question Answering | —Unverified | 0 | 0 |
| Exploring Spatial Language Grounding Through Referring Expressions | Feb 4, 2025 | Image Captioning, Negation | —Unverified | 0 | 0 |
| Exploring Sparse Spatial Relation in Graph Inference for Text-Based VQA | Oct 13, 2023 | Graph Learning, Object | —Unverified | 0 | 0 |
| An Empirical Study of Batch Normalization and Group Normalization in Conditional Computation | Jul 31, 2019 | Conditional Image Generation, Few-Shot Learning | —Unverified | 0 | 0 |
| Exploring Question Decomposition for Zero-Shot VQA | Oct 25, 2023 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Exploring Human-like Attention Supervision in Visual Question Answering | Sep 19, 2017 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding | Nov 7, 2024 | document understanding, Optical Character Recognition | —Unverified | 0 | 0 |
| M4CXR: Exploring Multi-task Potentials of Multi-modal Large Language Models for Chest X-ray Interpretation | Aug 29, 2024 | Instruction Following, Medical Report Generation | —Unverified | 0 | 0 |
| MagiC: Evaluating Multimodal Cognition Toward Grounded Visual Reasoning | Jul 9, 2025 | Diagnostic, Multimodal Reasoning | —Unverified | 0 | 0 |
| MAGIC-VQA: Multimodal And Grounded Inference with Commonsense Knowledge for Visual Question Answering | Mar 24, 2025 | Graph Neural Network, Question Answering | —Unverified | 0 | 0 |
| Exploring Diverse Methods in Visual Question Answering | Apr 21, 2024 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Exploring Advanced Techniques for Visual Question Answering: A Comprehensive Comparison | Feb 20, 2025 | Diversity, Language Modeling | —Unverified | 0 | 0 |
| Making the Most of What You Have: Adapting Pre-trained Visual Language Models in the Low-data Regime | May 3, 2023 | Image Captioning, Question Answering | —Unverified | 0 | 0 |
| An Empirical Evaluation of Visual Question Answering for Novel Objects | Apr 8, 2017 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Explore the Hallucination on Low-level Perception for MLLMs | Sep 15, 2024 | Hallucination, Question Answering | —Unverified | 0 | 0 |
| Video Question Answering for People with Visual Impairments Using an Egocentric 360-Degree Camera | May 30, 2024 | Question Answering, Video Question Answering | —Unverified | 0 | 0 |
| MAMO: Masked Multimodal Modeling for Fine-Grained Vision-Language Representation Learning | Oct 9, 2022 | Image-text Retrieval, multimodal interaction | —Unverified | 0 | 0 |
| Explicit Reasoning over End-to-End Neural Architectures for Visual Question Answering | Mar 23, 2018 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |