| Generalization Differences between End-to-End and Neuro-Symbolic Vision-Language Reasoning Systems | Oct 26, 2022 | Question Answering, Visual Question Answering | —Unverified | 0 |
| MF2-MVQA: A Multi-stage Feature Fusion method for Medical Visual Question Answering | Nov 11, 2022 | Medical Visual Question Answering, Question Answering | —Unverified | 0 |
| Compact Tensor Pooling for Visual Question Answering | Jun 20, 2017 | Question Answering, Visual Question Answering | —Unverified | 0 |
| Gender and Racial Bias in Visual Question Answering Datasets | May 17, 2022 | Question Answering, Visual Question Answering | —Unverified | 0 |
| Measuring CLEVRness: Black-box Testing of Visual Reasoning Models | Sep 29, 2021 | Benchmarking, Diagnostic | —Unverified | 0 |
| Gemini Pro Defeated by GPT-4V: Evidence from Education | Dec 27, 2023 | image-classification, Image Classification | —Unverified | 0 |
| Measuring CLEVRness: Blackbox testing of Visual Reasoning Models | Feb 24, 2022 | Benchmarking, Diagnostic | —Unverified | 0 |
| GEMeX-ThinkVG: Towards Thinking with Visual Grounding in Medical VQA via Reinforcement Learning | Jun 22, 2025 | Answer Generation, Decision Making | —Unverified | 0 |
| GEMeX: A Large-Scale, Groundable, and Explainable Medical VQA Benchmark for Chest X-ray Diagnosis | Nov 25, 2024 | Medical Visual Question Answering, Multiple-choice | —Unverified | 0 |
| A Thousand Words Are Worth More Than a Picture: Natural Language-Centric Outside-Knowledge Visual Question Answering | Jan 14, 2022 | Generative Question Answering, Image to text | —Unverified | 0 |
| Measuring Machine Intelligence Through Visual Question Answering | Aug 31, 2016 | Image Captioning, Question Answering | —Unverified | 0 |
| GC-KBVQA: A New Four-Stage Framework for Enhancing Knowledge Based Visual Question Answering Performance | May 25, 2025 | Caption Generation, Question Answering | —Unverified | 0 |
| Combining Knowledge Graph and LLMs for Enhanced Zero-shot Visual Question Answering | Jan 22, 2025 | Knowledge Graphs, Question Answering | —Unverified | 0 |
| Gamified crowd-sourcing of high-quality data for visual fine-tuning | Oct 5, 2024 | Visual Question Answering | —Unverified | 0 |
| All You May Need for VQA are Image Captions | Jan 16, 2022 | All, Image Captioning | —Unverified | 0 |
| AdaDARE-gamma: Balancing Stability and Plasticity in Multi-modal LLMs through Efficient Adaptation | Jan 1, 2025 | Image Captioning, Question Answering | —Unverified | 0 |
| FVQA: Fact-based Visual Question Answering | Jun 17, 2016 | Common Sense Reasoning, Question Answering | —Unverified | 0 |
| FVQA 2.0: Introducing Adversarial Samples into Fact-based Visual Question Answering | Mar 19, 2023 | Common Sense Reasoning, Information Retrieval | —Unverified | 0 |
| Fusion of Domain-Adapted Vision and Language Models for Medical Visual Question Answering | Apr 24, 2024 | Language Modeling, Language Modelling | —Unverified | 0 |
| Fusion of Detected Objects in Text for Visual Question Answering | Aug 14, 2019 | Question Answering, Visual Commonsense Reasoning | —Unverified | 0 |
| COIN: Counterfactual Image Generation for VQA Interpretation | Jan 10, 2022 | counterfactual, Image Generation | —Unverified | 0 |
| A survey on VQA: Datasets and Approaches | May 2, 2021 | Question Answering, Survey | —Unverified | 0 |
| Med-2E3: A 2D-Enhanced 3D Medical Multimodal Large Language Model | Nov 19, 2024 | Language Modeling, Language Modelling | —Unverified | 0 |
| FunBench: Benchmarking Fundus Reading Skills of MLLMs | Mar 2, 2025 | Anatomy, Benchmarking | —Unverified | 0 |
| AdaCoder: Adaptive Prompt Compression for Programmatic Visual Question Answering | Jul 28, 2024 | Question Answering, Visual Question Answering | —Unverified | 0 |