Improving Users' Mental Model with Attention-directed Counterfactual Edits Oct 13, 2021 counterfactual Question Answering
Improving Vision-and-Language Reasoning via Spatial Relations Modeling Nov 9, 2023 Position regression Relation
Improving Visual Question Answering by Referring to Generated Paragraph Captions Jun 14, 2019 Decoder Image Captioning
Improving Visual Question Answering Models through Robustness Analysis and In-Context Learning with a Chain of Basic Questions Apr 6, 2023 In-Context Learning Question Answering
Aligning MAGMA by Few-Shot Learning and Finetuning Oct 18, 2022 Few-Shot Learning Image Captioning
Learning Visual Knowledge Memory Networks for Visual Question Answering Jun 13, 2018 Question Answering Visual Question Answering
Graph Neural Networks in Vision-Language Image Understanding: A Survey Mar 7, 2023 Image Captioning Image Retrieval
Cycle-Consistency for Robust Visual Question Answering Feb 15, 2019 Question Answering Question Generation
Compositional Memory for Visual Question Answering Nov 18, 2015 Question Answering Visual Question Answering
In Factuality: Efficient Integration of Relevant Facts for Visual Question Answering Aug 1, 2021 Question Answering Visual Question Answering
Achieving Human Parity on Visual Question Answering Nov 17, 2021 Question Answering Visual Question Answering
Graph Edit Distance Reward: Learning to Edit Scene Graph Aug 15, 2020 Graph Matching Image Retrieval
A survey on VQA_Datasets and Approaches May 2, 2021 Question Answering Survey
A survey on knowledge-enhanced multimodal learning Nov 19, 2022 Conditional Image Generation Factual Visual Question Answering
Instance-Level Trojan Attacks on Visual Question Answering via Adversarial Learning in Neuron Activation Space Apr 2, 2023 Question Answering Visual Question Answering
Graph-based Heuristic Search for Module Selection Procedure in Neural Module Network Sep 30, 2020 Heuristic Search Question Answering
Instruction-augmented Multimodal Alignment for Image-Text and Element Matching Apr 16, 2025 Image Augmentation Image Generation
GRAM: Global Reasoning for Multi-Page VQA Jan 7, 2024 Question Answering Visual Question Answering
Compositional Attention Networks for Interpretability in Natural Language Question Answering Oct 30, 2018 Logical Reasoning Question Answering
Integrating Frequency-Domain Representations with Low-Rank Adaptation in Vision-Language Models Mar 8, 2025 Caption Generation Question Answering
A Causal Approach to Mitigate Modality Preference Bias in Medical Visual Question Answering May 22, 2025 counterfactual Medical Visual Question Answering
Learning to Recognize the Unseen Visual Predicates Sep 25, 2019 Question Answering Visual Question Answering
Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision Apr 20, 2020 counterfactual image-classification
Interactive Visual Task Learning for Robots Dec 20, 2023 Continual Learning Novel Concepts
Leveraging Visual Question Answering to Improve Text-to-Image Synthesis Oct 28, 2020 Auxiliary Learning Image Generation
Davidsonian Scene Graph: Improving Reliability in Fine-grained Evaluation for Text-to-Image Generation Oct 27, 2023 Image Generation Question Answering
Component Analysis for Visual Question Answering Architectures Feb 12, 2020 Question Answering Representation Learning
Aligned Image-Word Representations Improve Inductive Transfer Across Vision-Language Tasks Apr 2, 2017 Multi-Task Learning Question Answering
A Study on Multimodal and Interactive Explanations for Visual Question Answering Mar 1, 2020 Explainable Artificial Intelligence (XAI) Prediction
Interpretable Counting for Visual Question Answering Dec 23, 2017 Question Answering Visual Question Answering
Interpretable Face Anti-Spoofing: Enhancing Generalization with Multimodal Large Language Models Jan 3, 2025 Binary Classification Face Anti-Spoofing
Interpretable Medical Image Visual Question Answering via Multi-Modal Relationship Graph Learning Feb 19, 2023 Graph Learning Medical Visual Question Answering
Interpretable Neural Computation for Real-World Compositional Visual Question Answering Oct 10, 2020 Question Answering Visual Question Answering
Interpretable Visual Question Answering Referring to Outside Knowledge Mar 8, 2023 Diversity Image Captioning
Learning to Disambiguate by Asking Discriminative Questions Aug 9, 2017 Benchmarking Image Captioning
Interpretable Visual Question Answering by Visual Grounding from Attention Supervision Mining Aug 1, 2018 Question Answering Visual Grounding
GPT-4o System Card Oct 25, 2024 Multiple-choice Spatial Reasoning
Compact Tensor Pooling for Visual Question Answering Jun 20, 2017 Question Answering Visual Question Answering
Good, Better, Best: Textual Distractors Generation for Multiple-Choice Visual Question Answering via Reinforcement Learning Oct 21, 2019 Data Augmentation Decision Making
Astrea: A MOE-based Visual Understanding Model with Progressive Alignment Mar 12, 2025 Contrastive Learning Cross-Modal Retrieval
Aligned Vector Quantization for Edge-Cloud Collabrative Vision-Language Models Nov 8, 2024 Quantization Question Answering
Goal-Oriented Semantic Communication for Wireless Visual Question Answering Nov 3, 2024 Edge-computing Question Answering
Commonsense Video Question Answering through Video-Grounded Entailment Tree Reasoning Jan 9, 2025 Benchmarking Question Answering
Inverse Visual Question Answering with Multi-Level Attentions Sep 17, 2019 Question Answering Visual Question Answering
Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design May 22, 2023 image-classification Image Classification
ComicsPAP: understanding comic strips by picking the correct panel Mar 11, 2025 Image Captioning Visual Question Answering (VQA)
GeoRSMLLM: A Multimodal Large Language Model for Vision-Language Tasks in Geoscience and Remote Sensing Mar 16, 2025 Change Detection Image Captioning
Decoupled Box Proposal and Featurization with Ultrafine-Grained Semantic Labels Improve Image Captioning and Visual Question Answering Sep 4, 2019 Image Captioning Object
Geometry-Aware Video Quality Assessment for Dynamic Digital Human Oct 24, 2023 Attribute Video Quality Assessment
Evaluating and Improving Interactions with Hazy Oracles Oct 19, 2021 Object Tracking Referring Expression