Hierarchical Memory for Long Video QA Jun 30, 2024 GPU Question Answering
— Unverified 00 Hierarchical Modeling for Medical Visual Question Answering with Cross-Attention Fusion Apr 4, 2025 Diagnostic Medical Visual Question Answering
— Unverified 00 High Frame Rate Video Quality Assessment using VMAF and Entropic Differences Sep 27, 2021 Video Quality Assessment Visual Question Answering (VQA)
— Unverified 00 Highly Efficient No-reference 4K Video Quality Assessment with Full-Pixel Covering Sampling and Training Strategy Jul 30, 2024 4k Video Quality Assessment
— Unverified 00 Hints of Prompt: Enhancing Visual Representation for Multimodal LLMs in Autonomous Driving Nov 20, 2024 Autonomous Driving Multimodal Reasoning
— Unverified 00 HiTeA: Hierarchical Temporal-Aware Video-Language Pre-training Dec 30, 2022 cross-modal alignment TGIF-Action
— Unverified 00 How Far Can Off-the-Shelf Multimodal Large Language Models Go in Online Episodic Memory Question Answering? Jun 19, 2025 Multiple-choice Question Answering
— Unverified 00 How good are deep models in understanding the generated images? Aug 23, 2022 Object Object Recognition
— Unverified 00 How Much Can CLIP Benefit Vision-and-Language Tasks? Sep 29, 2021 Question Answering Visual Entailment
— Unverified 00 How (not) to ensemble LVLMs for VQA Oct 10, 2023 Retrieval Visual Question Answering (VQA)
— Unverified 00 How to Design Sample and Computationally Efficient VQA Models Mar 22, 2021 Question Answering Visual Question Answering
— Unverified 00 How to find a good image-text embedding for remote sensing visual question answering? Sep 24, 2021 Question Answering Visual Question Answering
— Unverified 00 How Transferable are Reasoning Patterns in VQA? Apr 8, 2021 Question Answering Visual Question Answering
— Unverified 00 How Well Can Vision-Language Models Understand Humans' Intention? An Open-ended Theory of Mind Question Evaluation Benchmark Mar 28, 2025 Question Answering Visual Question Answering
— Unverified 00 HRVQA: A Visual Question Answering Benchmark for High-Resolution Aerial Images Jan 23, 2023 Attribute Question Answering
— Unverified 00 Human-Adversarial Visual Question Answering Jun 4, 2021 Question Answering Visual Question Answering
— Unverified 00 Human Attention in Visual Question Answering: Do Humans and Deep Networks Look at the Same Regions? Jun 17, 2016 Question Answering Visual Question Answering
— Unverified 00 Human Attention in Visual Question Answering: Do Humans and Deep Networks Look at the Same Regions? Jun 11, 2016 Question Answering Visual Question Answering
— Unverified 00 Hummingbird: High Fidelity Image Generation via Multimodal Context Alignment Feb 7, 2025 Diversity Human-Object Interaction Detection
— Unverified 00 HVS Revisited: A Comprehensive Video Quality Assessment Framework Oct 9, 2022 Video Quality Assessment Visual Question Answering (VQA)
— Unverified 00 Hyperbolic Attention Networks May 24, 2018 Machine Translation Question Answering
— Unverified 00 Hyper-dimensional computing for a visual question-answering system that is trainable end-to-end Nov 28, 2017 Question Answering Visual Question Answering
— Unverified 00 Hypo3D: Exploring Hypothetical Reasoning in 3D Feb 2, 2025 Question Answering Visual Question Answering
— Unverified 00 ICDAR 2019 Competition on Scene Text Visual Question Answering Jun 30, 2019 Question Answering Visual Question Answering
— Unverified 00 ICDAR 2021 Competition on Document Visual Question Answering Nov 10, 2021 Visual Question Answering (VQA)
— Unverified 00 CLIPPO: Image-and-Language Understanding from Pixels Only Dec 15, 2022 Contrastive Learning image-classification
— Unverified 00 Image Captioning and Visual Question Answering Based on Attributes and External Knowledge Mar 9, 2016 General Knowledge Image Captioning
— Unverified 00 Image Captioning with Compositional Neural Module Networks Jul 10, 2020 Image Captioning Question Answering
— Unverified 00 Image Manipulation via Multi-Hop Instructions -- A New Dataset and Weakly-Supervised Neuro-Symbolic Approach May 23, 2023 Image Manipulation Question Answering
— Unverified 00 Image Position Prediction in Multimodal Documents May 1, 2020 Articles Caption Generation
— Unverified 00 Image Semantic Relation Generation Oct 19, 2022 Image Retrieval Image Segmentation
— Unverified 00 ImageTTR: Grounding Type Theory with Records in Image Classification for Visual Question Answering Jun 1, 2019 General Classification image-classification
— Unverified 00 Improved Bilinear Pooling with CNNs Jul 21, 2017 GPU Question Answering
— Unverified 00 Improved Few-Shot Image Classification Through Multiple-Choice Questions Jul 23, 2024 Articles Few-Shot Image Classification
— Unverified 00 Improving and Diagnosing Knowledge-Based Visual Question Answering via Entity Enhanced Knowledge Injection Dec 13, 2021 Common Sense Reasoning Knowledge Graph Embeddings
— Unverified 00 Improving Automatic VQA Evaluation Using Large Language Models Oct 4, 2023 In-Context Learning Question Answering
— Unverified 00 Improving Cross-Modal Understanding in Visual Dialog via Contrastive Learning Apr 15, 2022 Contrastive Learning Question Answering
— Unverified 00 Improving Data Augmentation for Robust Visual Question Answering with Effective Curriculum Learning Jan 28, 2024 Data Augmentation Question Answering
— Unverified 00 Improving Generalization in Visual Reasoning via Self-Ensemble Oct 28, 2024 Visual Question Answering (VQA) Visual Reasoning
— Unverified 00 Improving Medical Reasoning with Curriculum-Aware Reinforcement Learning May 25, 2025 Out-of-Distribution Generalization reinforcement-learning
— Unverified 00 Improving mitosis detection on histopathology images using large vision-language models Oct 11, 2023 Domain Generalization Image Captioning
— Unverified 00 Improving Users' Mental Model with Attention-directed Counterfactual Edits Oct 13, 2021 counterfactual Question Answering
— Unverified 00 Improving Vision-and-Language Reasoning via Spatial Relations Modeling Nov 9, 2023 Position regression Relation
— Unverified 00 Improving Visual Question Answering by Referring to Generated Paragraph Captions Jun 14, 2019 Decoder Image Captioning
— Unverified 00 Improving Visual Question Answering Models through Robustness Analysis and In-Context Learning with a Chain of Basic Questions Apr 6, 2023 In-Context Learning Question Answering
— Unverified 00 Improving VQA and its Explanations by Comparing Competing Explanations Jun 28, 2020 Question Answering Visual Question Answering
— Unverified 00 Incorporating External Knowledge to Answer Open-Domain Visual Questions with Dynamic Memory Networks Dec 3, 2017 Question Answering Visual Question Answering
— Unverified 00 In Factuality: Efficient Integration of Relevant Facts for Visual Question Answering Aug 1, 2021 Question Answering Visual Question Answering
— Unverified 00 InfographicVQA Apr 26, 2021 Question Answering Visual Question Answering
— Unverified 00 Instance-Level Trojan Attacks on Visual Question Answering via Adversarial Learning in Neuron Activation Space Apr 2, 2023 Question Answering Visual Question Answering
— Unverified 00