From Shallow to Deep: Compositional Reasoning over Graphs for Visual Question Answering Jun 25, 2022 Question Answering Visual Question Answering
From Strings to Things: Knowledge-Enabled VQA Model That Can Read and Reason Oct 1, 2019 Graph Neural Network Question Answering
CRIC: A VQA Dataset for Compositional Reasoning on Vision and Commonsense Aug 8, 2019 Question Answering Visual Question Answering (VQA)
From Wrong To Right: A Recursive Approach Towards Vision-Language Explanation Nov 21, 2023 Explanation Generation Visual Question Answering (VQA)
Full-reference Video Quality Assessment for User Generated Content Transcoding Dec 19, 2023 Video Quality Assessment Visual Question Answering (VQA)
FunBench: Benchmarking Fundus Reading Skills of MLLMs Mar 2, 2025 Anatomy Benchmarking
Fusion of Detected Objects in Text for Visual Question Answering Aug 14, 2019 Question Answering Visual Commonsense Reasoning
Fusion of Domain-Adapted Vision and Language Models for Medical Visual Question Answering Apr 24, 2024 Language Modeling Language Modelling
FVQA 2.0: Introducing Adversarial Samples into Fact-based Visual Question Answering Mar 19, 2023 Common Sense Reasoning Information Retrieval
FVQA: Fact-based Visual Question Answering Jun 17, 2016 Common Sense Reasoning Question Answering
GC-KBVQA: A New Four-Stage Framework for Enhancing Knowledge Based Visual Question Answering Performance May 25, 2025 Caption Generation Question Answering
GEMeX: A Large-Scale, Groundable, and Explainable Medical VQA Benchmark for Chest X-ray Diagnosis Nov 25, 2024 Medical Visual Question Answering Multiple-choice
GEMeX-ThinkVG: Towards Thinking with Visual Grounding in Medical VQA via Reinforcement Learning Jun 22, 2025 Answer Generation Decision Making
Gemini Pro Defeated by GPT-4V: Evidence from Education Dec 27, 2023 image-classification Image Classification
Gender and Racial Bias in Visual Question Answering Datasets May 17, 2022 Question Answering Visual Question Answering
Generalization Differences between End-to-End and Neuro-Symbolic Vision-Language Reasoning Systems Oct 26, 2022 Question Answering Visual Question Answering
Generalized Hadamard-Product Fusion Operators for Visual Question Answering Mar 26, 2018 Neural Architecture Search Question Answering
Generate then Select: Open-ended Visual Question Answering Guided by World Knowledge May 30, 2023 Answer Selection Question Answering
Generating Natural Language Explanations for Visual Question Answering using Scene Graphs and Visual Attention Feb 15, 2019 Explanation Generation Language Modeling
Generating Natural Questions from Images for Multimodal Assistants Nov 17, 2020 Common Sense Reasoning Natural Questions
Generating Rationales in Visual Question Answering Apr 4, 2020 Question Answering Visual Question Answering
Generating Triples with Adversarial Networks for Scene Graph Construction Feb 7, 2018 Attribute graph construction
Generative Visual Question Answering Jul 18, 2023 Generative Visual Question Answering Question Answering
Geometry-Aware Video Quality Assessment for Dynamic Digital Human Oct 24, 2023 Attribute Video Quality Assessment
GeoRSMLLM: A Multimodal Large Language Model for Vision-Language Tasks in Geoscience and Remote Sensing Mar 16, 2025 Change Detection Image Captioning
Getting ViT in Shape: Scaling Laws for Compute-Optimal Model Design May 22, 2023 image-classification Image Classification
Goal-Oriented Semantic Communication for Wireless Visual Question Answering Nov 3, 2024 Edge-computing Question Answering
Good, Better, Best: Textual Distractors Generation for Multiple-Choice Visual Question Answering via Reinforcement Learning Oct 21, 2019 Data Augmentation Decision Making
GPT-4o System Card Oct 25, 2024 Multiple-choice Spatial Reasoning
GRAM: Global Reasoning for Multi-Page VQA Jan 7, 2024 Question Answering Visual Question Answering
Graph-based Heuristic Search for Module Selection Procedure in Neural Module Network Sep 30, 2020 Heuristic Search Question Answering
Graph Edit Distance Reward: Learning to Edit Scene Graph Aug 15, 2020 Graph Matching Image Retrieval
Graph Neural Networks in Vision-Language Image Understanding: A Survey Mar 7, 2023 Image Captioning Image Retrieval
Bilinear Graph Networks for Visual Question Answering Jul 23, 2019 Question Answering Visual Question Answering
Graph Relation Transformer: Incorporating pairwise object features into the Transformer architecture Nov 11, 2021 Graph Attention Question Answering
Graph-Structured Representations for Visual Question Answering Sep 19, 2016 Multiple-choice Question Answering
Griffon-G: Bridging Vision-Language and Vision-Centric Tasks via Large Multimodal Models Oct 21, 2024 Instruction Following object-detection
Grounded Word Sense Translation Jun 1, 2019 Grounded language learning Machine Translation
Grounding Answers for Visual Questions Asked by Visually Impaired People Jun 20, 2022 Question Answering Visual Question Answering
Grounding Chest X-Ray Visual Question Answering with Generated Radiology Reports May 22, 2025 Answer Generation Question Answering
Grounding Complex Navigational Instructions Using Scene Graphs Jun 3, 2021 Question Answering reinforcement-learning
Guiding Medical Vision-Language Models with Explicit Visual Prompts: Framework Design and Comprehensive Exploration of Prompt Variations Jan 4, 2025 Decoder Visual Question Answering (VQA)
Guiding Visual Question Answering with Attention Priors May 25, 2022 Question Answering Visual Grounding
Guiding Visual Question Generation Oct 15, 2021 Question Generation Question-Generation
HAMMR: HierArchical MultiModal React agents for generic VQA Apr 8, 2024 Optical Character Recognition (OCR) Question Answering
Hardware-Friendly Static Quantization Method for Video Diffusion Transformers Feb 20, 2025 Quantization Video Generation
HAUR: Human Annotation Understanding and Recognition Through Text-Heavy Images Dec 24, 2024 Optical Character Recognition (OCR) Question Answering
HD-EPIC: A Highly-Detailed Egocentric Video Dataset Feb 6, 2025 Action Recognition Nutrition
HDR-ChipQA: No-Reference Quality Assessment on High Dynamic Range Videos Apr 25, 2023 Video Quality Assessment Visual Question Answering (VQA)
Hierarchical Graph Attention Network for Few-Shot Visual-Semantic Learning Jan 1, 2021 Graph Attention Image Captioning