| Visual Question Answering in the Medical Domain | Sep 20, 2023 | Contrastive Learning, Medical Visual Question Answering | —Unverified | 0 |
| Visual Question Answering on 360° Images | Jan 10, 2020 | Question Answering, Visual Question Answering | —Unverified | 0 |
| Visual Question Answering on Image Sets | Aug 27, 2020 | Question Answering, Visual Question Answering | —Unverified | 0 |
| Visual Question Answering on Multiple Remote Sensing Image Modalities | May 21, 2025 | Question Answering, Visual Question Answering | —Unverified | 0 |
| Visual Question Answering Using Semantic Information from Image Descriptions | Apr 23, 2020 | Question Answering, Visual Question Answering | —Unverified | 0 |
| Visual Question Answering (VQA) on Images with Superimposed Text | Jun 13, 2023 | Question Answering, Visual Question Answering | —Unverified | 0 |
| Visual Question Answering with Memory-Augmented Networks | Jul 17, 2017 | Question Answering, Visual Question Answering | —Unverified | 0 |
| Visual Question Answering with Prior Class Semantics | May 4, 2020 | Question Answering, Visual Question Answering | —Unverified | 0 |
| Visual Question Answering with Question Representation Update (QRU) | Dec 1, 2016 | Question Answering, Visual Question Answering | —Unverified | 0 |
| Visual Question Generation as Dual Task of Visual Question Answering | Sep 21, 2017 | Question Answering, Question Generation | —Unverified | 0 |
| Visual Question: Predicting If a Crowd Will Agree on the Answer | Aug 29, 2016 | Question Answering, valid | —Unverified | 0 |
| Visual Question Reasoning on General Dependency Tree | Mar 31, 2018 | Question Answering, Visual Question Answering | —Unverified | 0 |
| Visual Reference Resolution using Attention Memory for Visual Dialog | Sep 23, 2017 | Parameter Prediction, Question Answering | —Unverified | 0 |
| Visual Relationship Detection using Scene Graphs: A Survey | May 16, 2020 | Graph Generation, Image Generation | —Unverified | 0 |
| Visual Superordinate Abstraction for Robust Concept Learning | May 28, 2022 | Attribute, Question Answering | —Unverified | 0 |
| Visual TTR - Modelling Visual Question Answering in Type Theory with Records | May 1, 2019 | Question Answering, Visual Question Answering | —Unverified | 0 |
| ViT3D Alignment of LLaMA3: 3D Medical Image Report Generation | Oct 11, 2024 | Diagnostic, Language Modeling | —Unverified | 0 |
| ViUniT: Visual Unit Tests for More Robust Visual Programming | Dec 12, 2024 | Image Generation, Image-text matching | —Unverified | 0 |
| VL-BEiT: Generative Vision-Language Pretraining | Jun 2, 2022 | image-classification, Image Classification | —Unverified | 0 |
| VLFeedback: A Large-Scale AI Feedback Dataset for Large Vision-Language Models Alignment | Oct 12, 2024 | Diversity, Hallucination | —Unverified | 0 |
| VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks | Oct 7, 2024 | Information Retrieval, Language Modeling | —Unverified | 0 |
| VLMAE: Vision-Language Masked Autoencoder | Aug 19, 2022 | Image-text Retrieval, Language Modeling | —Unverified | 0 |
| VL-Mamba: Exploring State Space Models for Multimodal Learning | Mar 20, 2024 | Language Modeling, Language Modelling | —Unverified | 0 |
| VLM-Assisted Continual learning for Visual Question Answering in Self-Driving | Feb 2, 2025 | Autonomous Driving, Continual Learning | —Unverified | 0 |
| VLR-Bench: Multilingual Benchmark Dataset for Vision-Language Retrieval Augmented Generation | Dec 13, 2024 | Instruction Following, Question Answering | —Unverified | 0 |
| EVJVQA Challenge: Multilingual Visual Question Answering | Feb 23, 2023 | Language Modeling, Language Modelling | —Unverified | 0 |
| VolDoGer: LLM-assisted Datasets for Domain Generalization in Vision-Language Tasks | Jul 29, 2024 | Deep Learning, Domain Generalization | —Unverified | 0 |
| VQA-Aid: Visual Question Answering for Post-Disaster Damage Assessment and Analysis | Jun 19, 2021 | Question Answering, Visual Question Answering | —Unverified | 0 |
| VQABQ: Visual Question Answering by Basic Questions | Mar 19, 2017 | Question Answering, Visual Question Answering | —Unverified | 0 |
| VQA-Diff: Exploiting VQA and Diffusion for Zero-Shot Image-to-3D Vehicle Asset Generation in Autonomous Driving | Jul 9, 2024 | Autonomous Driving, Image to 3D | —Unverified | 0 |
| VQA-E: Explaining, Elaborating, and Enhancing Your Answers for Visual Questions | Mar 20, 2018 | Explanatory Visual Question Answering, Multi-Task Learning | —Unverified | 0 |
| VQA-GEN: A Visual Question Answering Benchmark for Domain Generalization | Nov 1, 2023 | Domain Generalization, Question Answering | —Unverified | 0 |
| VQA-GNN: Reasoning with Multimodal Knowledge via Graph Neural Networks for Visual Question Answering | May 23, 2022 | Knowledge Graphs, Question Answering | —Unverified | 0 |
| VQA-LOL: Visual Question Answering under the Lens of Logic | Feb 19, 2020 | Negation, Question Answering | —Unverified | 0 |
| VQA-MHUG: A Gaze Dataset to Study Multimodal Neural Attention in Visual Question Answering | Sep 27, 2021 | Question Answering, Visual Question Answering | —Unverified | 0 |
| VQA Training Sets are Self-play Environments for Generating Few-shot Pools | May 30, 2024 | Question Answering, Visual Question Answering | —Unverified | 0 |
| VQAttack: Transferable Adversarial Attacks on Visual Question Answering via Pre-trained Models | Feb 16, 2024 | Adversarial Robustness, Language Modelling | —Unverified | 0 |
| VQA with Cascade of Self- and Co-Attention Blocks | Feb 28, 2023 | Question Answering, Visual Question Answering | —Unverified | 0 |
| VSA4VQA: Scaling a Vector Symbolic Architecture to Visual Question Answering on Natural Images | May 6, 2024 | Attribute, Language Modeling | —Unverified | 0 |
| WangLab at MEDIQA-M3G 2024: Multimodal Medical Answer Generation using Large Language Models | Apr 22, 2024 | Answer Generation, image-classification | —Unverified | 0 |
| Weak Supervision helps Emergence of Word-Object Alignment and improves Vision-Language Tasks | Dec 6, 2019 | Image Retrieval, Inductive Bias | —Unverified | 0 |
| What If We Recaption Billions of Web Images with LLaMA-3? | Jun 12, 2024 | Cross-Modal Retrieval, Image Generation | —Unverified | 0 |
| What is needed for simple spatial language capabilities in VQA? | Aug 17, 2019 | Diagnostic, Question Answering | —Unverified | 0 |
| What Large Language Models Bring to Text-rich VQA? | Nov 13, 2023 | Image Comprehension, Optical Character Recognition (OCR) | —Unverified | 0 |
| When are Lemons Purple? The Concept Association Bias of Vision-Language Models | Dec 22, 2022 | Attribute, image-classification | —Unverified | 0 |
| Where is this coming from? Making groundedness count in the evaluation of Document VQA models | Mar 24, 2025 | Question Answering, Visual Question Answering | —Unverified | 0 |
| Where To Look: Focus Regions for Visual Question Answering | Nov 23, 2015 | Question Answering, Visual Question Answering | —Unverified | 0 |
| Which Client is Reliable?: A Reliable and Personalized Prompt-based Federated Learning for Medical Image Question Answering | Oct 23, 2024 | Federated Learning, Medical Visual Question Answering | —Unverified | 0 |
| Why context matters in VQA and Reasoning: Semantic interventions for VLM input modalities | Oct 2, 2024 | Question Answering, Visual Question Answering | —Unverified | 0 |
| Why Does a Visual Question Have Different Answers? | Aug 12, 2019 | Question Answering, Visual Question Answering | —Unverified | 0 |