| Order Matters: Exploring Order Sensitivity in Multimodal Large Language Models | Oct 22, 2024 | In-Context Learning, Question Answering | —Unverified | 0 | 0 |
| ORD: Object Relationship Discovery for Visual Dialogue Generation | Jun 15, 2020 | Dialogue Generation, Graph Attention | —Unverified | 0 | 0 |
| ORION: A Holistic End-to-End Autonomous Driving Framework by Vision-Language Instructed Action Generation | Mar 25, 2025 | Action Generation, Autonomous Driving | —Unverified | 0 | 0 |
| Visual Graph Question Answering with ASP and LLMs for Language Parsing | Feb 13, 2025 | Graph Question Answering, Optical Character Recognition | —Unverified | 0 | 0 |
| Data Metabolism: An Efficient Data Design Schema For Vision Language Model | Apr 10, 2025 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| Data-Driven Calibration of Prediction Sets in Large Vision-Language Models Based on Inductive Conformal Prediction | Apr 24, 2025 | Conformal Prediction, Hallucination | —Unverified | 0 | 0 |
| Out of the Box: Reasoning with Graph Convolution Nets for Factual Visual Question Answering | Nov 1, 2018 | Factual Visual Question Answering, General Knowledge | —Unverified | 0 | 0 |
| Visual Grounding Strategies for Text-Only Natural Language Processing | Mar 25, 2021 | Image Retrieval, Language Modeling | —Unverified | 0 | 0 |
| Data Augmentation for Visual Question Answering | Sep 1, 2017 | Data Augmentation, General Classification | —Unverified | 0 | 0 |
| Overcoming Language Bias in Remote Sensing Visual Question Answering via Adversarial Training | Jun 1, 2023 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Overcoming Language Priors for Visual Question Answering Based on Knowledge Distillation | Jan 10, 2025 | Knowledge Distillation, Question Answering | —Unverified | 0 | 0 |
| Overcoming Language Priors in Visual Question Answering with Adversarial Regularization | Oct 8, 2018 | Question Answering, Visual Grounding | —Unverified | 0 | 0 |
| Visual Hallucination: Definition, Quantification, and Prescriptive Remediations | Mar 26, 2024 | Hallucination, Image Captioning | —Unverified | 0 | 0 |
| DARE: Diverse Visual Question Answering with Robustness Evaluation | Sep 26, 2024 | image-classification, Image Classification | —Unverified | 0 | 0 |
| Overview of TREC 2024 Medical Video Question Answering (MedVidQA) Track | Dec 15, 2024 | Image Captioning, Medical Question Answering | —Unverified | 0 | 0 |
| OVQA: A Clinically Generated Visual Question Answering Dataset | Jul 7, 2022 | Benchmarking, Medical Visual Question Answering | —Unverified | 0 | 0 |
| OWLViz: An Open-World Benchmark for Visual Question Answering | Mar 4, 2025 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| PaLI: A Jointly-Scaled Multilingual Language-Image Model | Sep 14, 2022 | Decoder, Few-Shot Image Classification | —Unverified | 0 | 0 |
| Damage Assessment after Natural Disasters with UAVs: Semantic Feature Extraction using Deep Learning | Dec 14, 2024 | Decision Making, Question Answering | —Unverified | 0 | 0 |
| PaLM2-VAdapter: Progressively Aligned Language Model Makes a Strong Vision-language Adapter | Feb 16, 2024 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| Cycle-Consistency for Robust Visual Question Answering | Feb 15, 2019 | Question Answering, Question Generation | —Unverified | 0 | 0 |
| PAM: Understanding Product Images in Cross Product Category Attribute Extraction | Jun 8, 2021 | Attribute, Attribute Extraction | —Unverified | 0 | 0 |
| CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark | Jun 10, 2024 | Diversity, Question Answering | —Unverified | 0 | 0 |
| C-VQA: A Compositional Split of the Visual Question Answering (VQA) v1.0 Dataset | Apr 26, 2017 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| AI2D-RST: A multimodal corpus of 1000 primary school science diagrams | Dec 9, 2019 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |