| Co-VQA : Answering by Interactive Sub Question Sequence | Nov 16, 2021 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Probing Inter-modality: Visual Parsing with Self-Attention for Vision-Language Pre-training | Jun 25, 2021 | Image-text Retrieval, Question Answering | —Unverified | 0 | 0 |
| Probing Inter-modality: Visual Parsing with Self-Attention for Vision-and-Language Pre-training | May 21, 2021 | Question Answering, Relation | —Unverified | 0 | 0 |
| Probing the Role of Positional Information in Vision-Language Models | Jan 16, 2022 | Contrastive Learning, Image-text matching | —Unverified | 0 | 0 |
| Probing the Role of Positional Information in Vision-Language Models | May 17, 2023 | Contrastive Learning, Image-text matching | —Unverified | 0 | 0 |
| Probing Visual Language Priors in VLMs | Dec 31, 2024 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding | Nov 6, 2023 | CoLA, Question Answering | —Unverified | 0 | 0 |
| ProcTag: Process Tagging for Assessing the Efficacy of Document Instruction Data | Jul 17, 2024 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Program Synthesis Benchmark for Visual Programming in XLogoOnline Environment | Jun 17, 2024 | Logical Reasoning, Math | —Unverified | 0 | 0 |
| Counterfactual Vision and Language Learning | Jun 1, 2020 | counterfactual, Question Answering | —Unverified | 0 | 0 |
| Visual Perturbation-aware Collaborative Learning for Overcoming the Language Prior Problem | Jul 24, 2022 | Diagnostic, Question Answering | —Unverified | 0 | 0 |
| Consistency and Uncertainty: Identifying Unreliable Responses From Black-Box Vision-Language Models for Selective Visual Question Answering | Apr 16, 2024 | Language Modelling, Prediction | —Unverified | 0 | 0 |
| Prompt-Aware Adapter: Towards Learning Adaptive Visual Tokens for Multimodal Large Language Models | May 24, 2024 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Prompt-based Personalized Federated Learning for Medical Visual Question Answering | Feb 15, 2024 | Federated Learning, Medical Visual Question Answering | —Unverified | 0 | 0 |
| PromptCap: Prompt-Guided Image Captioning for VQA with GPT-3 | Jan 1, 2023 | Image Captioning, Question Answering | —Unverified | 0 | 0 |
| Connecting Language and Vision to Actions | Jul 1, 2018 | Image Captioning, Language Modeling | —Unverified | 0 | 0 |
| Prompting Large Language Models with Rationale Heuristics for Knowledge-based Visual Question Answering | Dec 22, 2024 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Compressing Visual-linguistic Model via Knowledge Distillation | Apr 5, 2021 | Image Captioning, Knowledge Distillation | —Unverified | 0 | 0 |
| Proposal-free One-stage Referring Expression via Grid-Word Cross-Attention | May 5, 2021 | Question Answering, Referring Expression | —Unverified | 0 | 0 |
| Proposing Plausible Answers for Open-ended Visual Question Answering | Oct 20, 2016 | Graph Matching, Open-Ended Question Answering | —Unverified | 0 | 0 |
| PropTest: Automatic Property Testing for Improved Visual Programming | Mar 25, 2024 | Question Answering, Referring Expression | —Unverified | 0 | 0 |
| Compound Tokens: Channel Fusion for Vision-Language Representation Learning | Dec 2, 2022 | Decoder, Language Modeling | —Unverified | 0 | 0 |
| Provoking Multi-modal Few-Shot LVLM via Exploration-Exploitation In-Context Learning | Jun 11, 2025 | In-Context Learning, Question Answering | —Unverified | 0 | 0 |
| Psycholinguistics meets Continual Learning: Measuring Catastrophic Forgetting in Visual Question Answering | Jun 10, 2019 | Continual Learning, Question Answering | —Unverified | 0 | 0 |
| Pushing the Limits of Radiology with Joint Modeling of Visual and Textual Information | Jul 1, 2018 | Image Classification, Machine Translation | —Unverified | 0 | 0 |
| Pyramid Coder: Hierarchical Code Generator for Compositional Visual Question Answering | Jul 30, 2024 | Code Generation, Question Answering | —Unverified | 0 | 0 |
| Q2ATransformer: Improving Medical VQA via an Answer Querying Decoder | Apr 4, 2023 | Classification, Decoder | —Unverified | 0 | 0 |
| Compositional Memory for Visual Question Answering | Nov 18, 2015 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Adversarial VQA: A New Benchmark for Evaluating the Robustness of VQA Models | Jun 1, 2021 | Data Augmentation, Question Answering | —Unverified | 0 | 0 |
| Compositional Attention Networks for Interpretability in Natural Language Question Answering | Oct 30, 2018 | Logical Reasoning, Question Answering | —Unverified | 0 | 0 |
| QIRL: Boosting Visual Question Answering via Optimized Question-Image Relation Learning | Apr 4, 2025 | Data Augmentation, Image Generation | —Unverified | 0 | 0 |
| Visual Question Answering as a Meta Learning Task | Nov 22, 2017 | Meta-Learning, Question Answering | —Unverified | 0 | 0 |
| Visual Question Answering as a Multi-Task Problem | Jul 3, 2020 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Visual Question Answering as Reading Comprehension | Nov 29, 2018 | Common Sense Reasoning, General Knowledge | —Unverified | 0 | 0 |
| Component Analysis for Visual Question Answering Architectures | Feb 12, 2020 | Question Answering, Representation Learning | —Unverified | 0 | 0 |
| Accounting for Focus Ambiguity in Visual Questions | Jan 4, 2025 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Visual Question Answering: A Survey on Techniques and Common Trends in Recent Literature | May 18, 2023 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Question-Agnostic Attention for Visual Question Answering | Aug 9, 2019 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Compact Tensor Pooling for Visual Question Answering | Jun 20, 2017 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Combining Knowledge Graph and LLMs for Enhanced Zero-shot Visual Question Answering | Jan 22, 2025 | Knowledge Graphs, Question Answering | —Unverified | 0 | 0 |
| Question-Conditioned Counterfactual Image Generation for VQA | Nov 14, 2019 | counterfactual, Image Generation | —Unverified | 0 | 0 |
| Question-Driven Graph Fusion Network For Visual Question Answering | Apr 3, 2022 | Graph Attention, Object | —Unverified | 0 | 0 |
| Question Generation for Evaluating Cross-Dataset Shifts in Multi-modal Grounding | Jan 24, 2022 | Question Answering, Question Generation | —Unverified | 0 | 0 |
| Question-Guided Hybrid Convolution for Visual Question Answering | Aug 8, 2018 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Question Guided Modular Routing Networks for Visual Question Answering | Apr 17, 2019 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Question-Led Semantic Structure Enhanced Attentions for VQA | Nov 16, 2021 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Question Modifiers in Visual Question Answering | Jun 1, 2022 | Natural Language Understanding, Question Answering | —Unverified | 0 | 0 |
| Question Relevance in Visual Question Answering | Jul 23, 2018 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Question Relevance in VQA: Identifying Non-Visual And False-Premise Questions | Jun 21, 2016 | Question Answering, Question Similarity | —Unverified | 0 | 0 |
| Question Type Guided Attention in Visual Question Answering | Apr 6, 2018 | Activity Recognition, Question Answering | —Unverified | 0 | 0 |