Open-Ended Visual Question Answering by Multi-Modal Domain Adaptation Nov 11, 2019 Domain Adaptation Question Answering
OptiBox: Breaking the Limits of Proposals for Visual Grounding Nov 29, 2019 Image Captioning Visual Grounding
Optimizing Explanations by Network Canonization and Hyperparameter Search Nov 30, 2022 Explainable Artificial Intelligence (XAI) image-classification
Optimizing Vision-Language Interactions Through Decoder-Only Models Dec 14, 2024 Decoder Image Captioning
Optimizing Visual Question Answering Models for Driving: Bridging the Gap Between Human and Machine Attention Patterns Jun 13, 2024 Autonomous Driving Question Answering
ORD: Object Relationship Discovery for Visual Dialogue Generation Jun 15, 2020 Dialogue Generation Graph Attention
ORION: A Holistic End-to-End Autonomous Driving Framework by Vision-Language Instructed Action Generation Mar 25, 2025 Action Generation Autonomous Driving
Out of the Box: Reasoning with Graph Convolution Nets for Factual Visual Question Answering Nov 1, 2018 Factual Visual Question Answering General Knowledge
Overcoming Language Bias in Remote Sensing Visual Question Answering via Adversarial Training Jun 1, 2023 Question Answering Visual Question Answering
Overcoming Language Priors for Visual Question Answering Based on Knowledge Distillation Jan 10, 2025 Knowledge Distillation Question Answering
Overcoming Language Priors in Visual Question Answering with Adversarial Regularization Oct 8, 2018 Question Answering Visual Grounding
OVQA: A Clinically Generated Visual Question Answering Dataset Jul 7, 2022 Benchmarking Medical Visual Question Answering
PaLI: A Jointly-Scaled Multilingual Language-Image Model Sep 14, 2022 Decoder Few-Shot Image Classification
PaLM2-VAdapter: Progressively Aligned Language Model Makes a Strong Vision-language Adapter Feb 16, 2024 Language Modeling Language Modelling
PAM: Understanding Product Images in Cross Product Category Attribute Extraction Jun 8, 2021 Attribute Attribute Extraction
NAPA: Intermediate-level Variational Native-pulse Ansatz for Variational Quantum Algorithms Aug 2, 2022 Neural Architecture Search Visual Question Answering (VQA)
Parameter-Parallel Distributed Variational Quantum Algorithm Jul 31, 2022 Visual Question Answering (VQA)
ParsVQA-Caps: A Benchmark for Visual Question Answering and Image Captioning in Persian Dec 7, 2022 Image Captioning Question Answering
Pathological Visual Question Answering Oct 6, 2020 AI Agent Question Answering
PathVLM-R1: A Reinforcement Learning-Driven Reasoning Model for Pathology Visual-Language Tasks Apr 12, 2025 Computed Tomography (CT) Question Answering
PDF-MVQA: A Dataset for Multimodal Information Retrieval in PDF-based Visual Question Answering Apr 19, 2024 Articles Information Retrieval
PDFVQA: A New Dataset for Real-World VQA on PDF Documents Apr 13, 2023 document understanding Key Information Extraction
Perception Test 2024: Challenge Summary and a Novel Hour-Long VideoQA Benchmark Nov 29, 2024 Benchmarking Grounded Video Question Answering
Perceptual Quality Assessment of UGC Gaming Videos Mar 31, 2022 Video Quality Assessment Visual Question Answering (VQA)
Performance Analysis of Traditional VQA Models Under Limited Computational Resources Feb 9, 2025 Question Answering Visual Question Answering
PhyBlock: A Progressive Benchmark for Physical Understanding and Planning via 3D Block Assembly Jun 10, 2025 Question Answering Scene Understanding
PiggyBack: Pretrained Visual Question Answering Environment for Backing up Non-deep Learning Professionals Nov 29, 2022 Deep Learning Question Answering
PlanGPT-VL: Enhancing Urban Planning with Domain-Specific Vision-Language Models May 20, 2025 Visual Question Answering (VQA)
Playing Lottery Tickets with Vision and Language Apr 23, 2021 Image-text Retrieval Question Answering
Polar-VQA: Visual Question Answering on Remote Sensed Ice sheet Imagery from Polar Region Mar 13, 2023 Question Answering Visual Question Answering
Precision Empowers, Excess Distracts: Visual Question Answering With Dynamically Infused Knowledge In Language Models Jun 14, 2024 Decoder Knowledge Graphs
Predicting Relative Depth between Objects from Semantic Features Jan 12, 2021 Question Answering Visual Question Answering
PreSTU: Pre-Training for Scene-Text Understanding Sep 12, 2022 Decoder Image Captioning
Pre-training image-language transformers for open-vocabulary tasks Sep 9, 2022 Question Answering Visual Entailment
Priorformer: A UGC-VQA Method with content and distortion priors Jun 24, 2024 Video Quality Assessment Visual Question Answering (VQA)
Privacy-Aware Visual Language Models May 27, 2024 Visual Question Answering (VQA)
Privacy Preserving Visual Question Answering Feb 15, 2022 Privacy Preserving Question Answering
PRNet: A Progressive Regression Network for No-Reference User-Generated-Content Video Quality Assessment Sep 29, 2021 regression Video Quality Assessment
Probabilistic Neural-symbolic Models for Interpretable Visual Question Answering Feb 21, 2019 counterfactual Question Answering
Probing Inter-modality: Visual Parsing with Self-Attention for Vision-Language Pre-training Jun 25, 2021 Image-text Retrieval Question Answering
Probing Inter-modality: Visual Parsing with Self-Attention for Vision-and-Language Pre-training May 21, 2021 Question Answering Relation
Probing the Role of Positional Information in Vision-Language Models Jan 16, 2022 Contrastive Learning Image-text matching
Probing Visual Language Priors in VLMs Dec 31, 2024 Question Answering Visual Question Answering
ProcTag: Process Tagging for Assessing the Efficacy of Document Instruction Data Jul 17, 2024 Question Answering Visual Question Answering
Progressive Attention Memory Network for Movie Story Question Answering Apr 18, 2019 Question Answering Video Story QA
Prolonged Reasoning Is Not All You Need: Certainty-Based Adaptive Routing for Efficient LLM/MLLM Reasoning May 21, 2025 All Visual Question Answering (VQA)
Prompt-based Personalized Federated Learning for Medical Visual Question Answering Feb 15, 2024 Federated Learning Medical Visual Question Answering
PromptCap: Prompt-Guided Image Captioning for VQA with GPT-3 Jan 1, 2023 Image Captioning Question Answering
Prompting Large Language Models with Rationale Heuristics for Knowledge-based Visual Question Answering Dec 22, 2024 Question Answering Visual Question Answering
Prompt Tuning for Generative Multimodal Pretrained Models Aug 4, 2022 Image Captioning Visual Entailment