| RealCQA: Scientific Chart Question Answering as a Test-bed for First-Order Logic | Aug 3, 2023 | Chart Question Answering, Formal Logic | Code Available | 0 |
| Kvasir-VQA: A Text-Image Pair GI Tract Dataset | Sep 2, 2024 | Image Captioning, Image Generation | Code Available | 0 |
| A Neuro-Symbolic ASP Pipeline for Visual Question Answering | May 16, 2022 | Question Answering, Visual Question Answering | Code Available | 0 |
| KOFFVQA: An Objectively Evaluated Free-form VQA Benchmark for Large Vision-Language Models in the Korean Language | Mar 31, 2025 | Form, Question Answering | Code Available | 0 |
| Knowledge Generation for Zero-shot Knowledge-based VQA | Feb 4, 2024 | Question Answering, Visual Question Answering | Code Available | 0 |
| Toward Multi-Granularity Decision-Making: Explicit Visual Reasoning with Hierarchical Knowledge | Jan 1, 2023 | Decision Making, Question Answering | Code Available | 0 |
| Towards Addressing the Misalignment of Object Proposal Evaluation for Vision-Language Tasks via Semantic Grounding | Sep 1, 2023 | Graph Generation, Image Captioning | Code Available | 0 |
| Dual Attention Networks for Multimodal Reasoning and Matching | Nov 2, 2016 | Collaborative Inference, Image-text matching | Code Available | 0 |
| Recommending Themes for Ad Creative Design via Visual-Linguistic Representations | Jan 20, 2020 | Question Answering, Recommendation Systems | Code Available | 0 |
| DrishtiKon: Multi-Granular Visual Grounding for Text-Rich Document Images | Jun 26, 2025 | document understanding, Optical Character Recognition (OCR) | Code Available | 0 |
| Recursive Visual Attention in Visual Dialog | Dec 6, 2018 | Question Answering, Visual Dialog | Code Available | 0 |
| Knowledge Acquisition Disentanglement for Knowledge-based Visual Question Answering with Large Language Models | Jul 22, 2024 | Disentanglement, Question Answering | Code Available | 0 |
| ReDiT: Re-evaluating large visual question answering model confidence by defining input scenario Difficulty and applying Temperature mapping | Jan 6, 2025 | Question Answering, Visual Question Answering | Code Available | 0 |
| Towards a performance analysis on pre-trained Visual Question Answering models for autonomous driving | Jul 18, 2023 | Autonomous Driving, Model Selection | Code Available | 0 |
| Cascaded Mutual Modulation for Visual Reasoning | Sep 6, 2018 | Question Answering, Visual Question Answering | Code Available | 0 |
| Knowing Earlier what Right Means to You: A Comprehensive VQA Dataset for Grounding Relative Directions via Multi-Task Learning | Jul 6, 2022 | Diagnostic, Multi-Task Learning | Code Available | 0 |
| Reframing Spatial Reasoning Evaluation in Language Models: A Real-World Simulation Benchmark for Qualitative Reasoning | May 23, 2024 | Logical Reasoning Question Answering, Spatial Reasoning | Code Available | 0 |
| Towards a Unified Multimodal Reasoning Framework | Dec 22, 2023 | Multimodal Reasoning, Multiple-choice | Code Available | 0 |
| Relation-Aware Graph Attention Network for Visual Question Answering | Mar 29, 2019 | Graph Attention, Implicit Relations | Code Available | 0 |
| 'Just because you are right, doesn't mean I am wrong': Overcoming a Bottleneck in the Development and Evaluation of Open-Ended Visual Question Answering (VQA) Tasks | Mar 28, 2021 | Question Answering, Visual Question Answering | Code Available | 0 |
| Adaptive loose optimization for robust question answering | May 6, 2023 | Extractive Question-Answering, Machine Reading Comprehension | Code Available | 0 |
| REMIND Your Neural Network to Prevent Catastrophic Forgetting | Oct 6, 2019 | Quantization, Question Answering | Code Available | 0 |
| Bridging Vision and Language Spaces with Assignment Prediction | Apr 15, 2024 | Cross-Modal Retrieval, Image Captioning | Code Available | 0 |
| Joint Visual and Text Prompting for Improved Object-Centric Perception with Multimodal Large Language Models | Apr 6, 2024 | MME, Object | Code Available | 0 |
| Joint Answering and Explanation for Visual Commonsense Reasoning | Feb 25, 2022 | Knowledge Distillation, Question Answering | Code Available | 0 |