| Instance-Level Trojan Attacks on Visual Question Answering via Adversarial Learning in Neuron Activation Space | Apr 2, 2023 | Question Answering, Visual Question Answering | Unverified | 0 |
| Deep Attention Neural Tensor Network for Visual Question Answering | Sep 1, 2018 | Deep Attention, Question Answering | Unverified | 0 |
| Decoupled Box Proposal and Featurization with Ultrafine-Grained Semantic Labels Improve Image Captioning and Visual Question Answering | Sep 4, 2019 | Image Captioning, Object | Unverified | 0 |
| Benchmarking Vision Language Models for Cultural Understanding | Jul 15, 2024 | Benchmarking, Question Answering | Unverified | 0 |
| Decouple Before Interact: Multi-Modal Prompt Learning for Continual Visual Question Answering | Jan 1, 2023 | Continual Learning, Language Modelling | Unverified | 0 |
| Accuracy vs. Complexity: A Trade-off in Visual Question Answering Models | Jan 20, 2020 | Question Answering, Visual Question Answering | Unverified | 0 |
| InfographicVQA | Apr 26, 2021 | Question Answering, Visual Question Answering | Unverified | 0 |
| An Empirical Study on the Language Modal in Visual Question Answering | May 17, 2023 | Question Answering, Visual Question Answering | Unverified | 0 |
| Debating for Better Reasoning: An Unsupervised Multimodal Approach | May 20, 2025 | Question Answering, Visual Question Answering | Unverified | 0 |
| An Empirical Study on the Generalization Power of Neural Representations Learned via Visual Guessing Games | Jan 31, 2021 | Question Answering, Visual Question Answering | Unverified | 0 |
| DDRprog: A CLEVR Differentiable Dynamic Reasoning Programmer | Mar 30, 2018 | Question Answering, Visual Question Answering | Unverified | 0 |
| Davidsonian Scene Graph: Improving Reliability in Fine-grained Evaluation for Text-to-Image Generation | Oct 27, 2023 | Image Generation, Question Answering | Unverified | 0 |
| Dataset Bias Mitigation in Multiple-Choice Visual Question Answering and Beyond | Oct 23, 2023 | counterfactual, Multiple-choice | Unverified | 0 |
| Benchmarking Large Multimodal Models for Ophthalmic Visual Question Answering with OphthalWeChat | May 26, 2025 | Benchmarking, Question Answering | Unverified | 0 |
| Accounting for Focus Ambiguity in Visual Questions | Jan 4, 2025 | Question Answering, Visual Question Answering | Unverified | 0 |
| Data Metabolism: An Efficient Data Design Schema For Vision Language Model | Apr 10, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Data-Driven Calibration of Prediction Sets in Large Vision-Language Models Based on Inductive Conformal Prediction | Apr 24, 2025 | Conformal Prediction, Hallucination | Unverified | 0 |
| Data Augmentation for Visual Question Answering | Sep 1, 2017 | Data Augmentation, General Classification | Unverified | 0 |
| DARE: Diverse Visual Question Answering with Robustness Evaluation | Sep 26, 2024 | image-classification, Image Classification | Unverified | 0 |
| @Bench: Benchmarking Vision-Language Models for Human-centered Assistive Technology | Sep 21, 2024 | Benchmarking, Depth Estimation | Unverified | 0 |
| Damage Assessment after Natural Disasters with UAVs: Semantic Feature Extraction using Deep Learning | Dec 14, 2024 | Decision Making, Question Answering | Unverified | 0 |
| An Empirical Study on Leveraging Scene Graphs for Visual Question Answering | Jul 28, 2019 | Knowledge Graphs, Question Answering | Unverified | 0 |
| Cycle-Consistency for Robust Visual Question Answering | Feb 15, 2019 | Question Answering, Question Generation | Unverified | 0 |
| Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets | Apr 24, 2017 | Multiple-choice, Question Answering | Unverified | 0 |
| InfiMM-HD: A Leap Forward in High-Resolution Multimodal Understanding | Mar 3, 2024 | Visual Question Answering | Unverified | 0 |