VQA-GEN: A Visual Question Answering Benchmark for Domain Generalization Nov 1, 2023 Domain Generalization Question Answering
VQA-GNN: Reasoning with Multimodal Knowledge via Graph Neural Networks for Visual Question Answering May 23, 2022 Knowledge Graphs Question Answering
VQA-LOL: Visual Question Answering under the Lens of Logic Feb 19, 2020 Negation Question Answering
VQA-MHUG: A Gaze Dataset to Study Multimodal Neural Attention in Visual Question Answering Sep 27, 2021 Question Answering Visual Question Answering
VQA Training Sets are Self-play Environments for Generating Few-shot Pools May 30, 2024 Question Answering Visual Question Answering
VQAttack: Transferable Adversarial Attacks on Visual Question Answering via Pre-trained Models Feb 16, 2024 Adversarial Robustness Language Modelling
VQA with Cascade of Self- and Co-Attention Blocks Feb 28, 2023 Question Answering Visual Question Answering
VSA4VQA: Scaling a Vector Symbolic Architecture to Visual Question Answering on Natural Images May 6, 2024 Attribute Language Modeling
Watching the News: Towards VideoQA Models that can Read Nov 10, 2022 Question Answering Video Question Answering
Weakly Supervised Visual Question Answer Generation Jun 11, 2023 Answer Generation Dependency Parsing
Weak Supervision helps Emergence of Word-Object Alignment and improves Vision-Language Tasks Dec 6, 2019 Image Retrieval Inductive Bias
Webly Supervised Concept Expansion for General Purpose Vision Models Feb 4, 2022 Human-Object Interaction Detection Image Retrieval
What is needed for simple spatial language capabilities in VQA? Aug 17, 2019 Diagnostic Question Answering
What Large Language Models Bring to Text-rich VQA? Nov 13, 2023 Image Comprehension Optical Character Recognition (OCR)
What makes a good metric? Evaluating automatic metrics for text-to-image consistency Dec 18, 2024 Sensitivity Visual Question Answering (VQA)
When are Lemons Purple? The Concept Association Bias of Vision-Language Models Dec 22, 2022 Attribute image-classification
Where is this coming from? Making groundedness count in the evaluation of Document VQA models Mar 24, 2025 Question Answering Visual Question Answering
Where To Look: Focus Regions for Visual Question Answering Nov 23, 2015 Question Answering Visual Question Answering
Which Client is Reliable?: A Reliable and Personalized Prompt-based Federated Learning for Medical Image Question Answering Oct 23, 2024 Federated Learning Medical Visual Question Answering
Why context matters in VQA and Reasoning: Semantic interventions for VLM input modalities Oct 2, 2024 Question Answering Visual Question Answering
Why Does a Visual Question Have Different Answers? Aug 12, 2019 Question Answering Visual Question Answering
Why Does the VQA Model Answer No?: Improving Reasoning through Visual and Linguistic Inference Sep 25, 2019 Common Sense Reasoning Question Answering
WoLF: Wide-scope Large Language Model Framework for CXR Understanding Mar 19, 2024 Anatomy Instruction Following
Workshop on Document Intelligence Understanding Jul 31, 2023 document understanding Visual Question Answering (VQA)
WSI-LLaVA: A Multimodal Large Language Model for Whole Slide Image Dec 3, 2024 Diagnostic Language Modeling
WuDaoMM: A large-scale Multi-Modal Dataset for Pre-training models Mar 22, 2022 Image Captioning Image Generation
XGPT: Cross-modal Generative Pre-Training for Image Captioning Mar 3, 2020 Data Augmentation Denoising
xGQA: Cross-Lingual Visual Question Answering Oct 16, 2021 Cross-Lingual Transfer Language Modeling
Yin and Yang: Balancing and Answering Binary Visual Questions Nov 16, 2015 Question Answering Visual Question Answering
YouMakeup: A Large-Scale Domain-Specific Multimodal Dataset for Fine-Grained Semantic Comprehension Nov 1, 2019 Caption Generation Question Answering
ZALM3: Zero-Shot Enhancement of Vision-Language Alignment via In-Context Information in Multi-Turn Multimodal Medical Dialogue Sep 26, 2024 Medical Visual Question Answering Question Answering
Zero-Shot Anomaly Detection in Battery Thermal Images Using Visual Question Answering with Prior Knowledge May 22, 2025 Anomaly Detection Question Answering
Zero-Shot Transfer VQA Dataset Nov 2, 2018 Question Answering Transfer Learning
Zero-Shot Video Question Answering with Procedural Programs Dec 1, 2023 Code Generation Language Modeling
Zero-Shot Visual Question Answering Nov 17, 2016 Question Answering Retrieval
Zero-Shot Visual Reasoning by Vision-Language Models: Benchmarking and Analysis Aug 27, 2024 Benchmarking Large Language Model
Bidirectional Contrastive Split Learning for Visual Question Answering Aug 24, 2022 Adversarial Attack Backdoor Attack
Generating Question Relevant Captions to Aid Visual Question Answering Jun 3, 2019 General Knowledge Image Captioning
Explainable High-order Visual Question Reasoning: A New Benchmark and Knowledge-routed Network Sep 23, 2019 Question Answering Triplet
Does CLIP Benefit Visual Question Answering in the Medical Domain as Much as it Does in the General Domain? Dec 27, 2021 Articles Medical Visual Question Answering
Prompting Medical Large Vision-Language Models to Diagnose Pathologies by Visual Question Answering Jul 31, 2024 Diagnostic Hallucination
Ontology-based knowledge representation for bone disease diagnosis: a foundation for safe and sustainable medical artificial intelligence systems Jun 5, 2025 Diagnostic Multimodal Deep Learning
2nd Place Solution to the GQA Challenge 2019 Jul 16, 2019 Question Answering Visual Question Answering
3D Concept Learning and Reasoning from Multi-View Images Mar 20, 2023 Question Answering Visual Question Answering
3D-CT-GPT: Generating 3D Radiology Reports through Integration of Large Vision-Language Models Sep 28, 2024 Diagnostic Language Modeling
3D Question Answering Dec 15, 2021 3D geometry Question Answering
ABC: Achieving Better Control of Multimodal Embeddings using VLMs Mar 1, 2025 Image to text Image-to-Text Retrieval
ABC-CNN: An Attention Based Convolutional Neural Network for Visual Question Answering Nov 18, 2015 Question Answering Visual Question Answering
Abduction of Domain Relationships from Data for VQA Feb 13, 2025 Question Answering Visual Question Answering
A Causal Approach to Mitigate Modality Preference Bias in Medical Visual Question Answering May 22, 2025 counterfactual Medical Visual Question Answering