| Reducing Language Biases in Visual Question Answering with Visually-Grounded Question Encoder | Jul 13, 2020 | Question Answering, Visual Grounding | —Unverified | 0 | 0 |
| CLIP Models are Few-shot Learners: Empirical Studies on VQA and Visual Entailment | Mar 14, 2022 | parameter-efficient fine-tuning, Question Answering | —Unverified | 0 | 0 |
| Visual question answering: from early developments to recent advances -- a survey | Jan 7, 2025 | Descriptive, Natural Language Understanding | —Unverified | 0 | 0 |
| Regularizing Attention Networks for Anomaly Detection in Visual Question Answering | Sep 21, 2020 | Anomaly Detection, Question Answering | —Unverified | 0 | 0 |
| Visual Question Answering in Ophthalmology: A Progressive and Practical Perspective | Oct 22, 2024 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| CLEVR-POC: Reasoning-Intensive Visual Question Answering in Partially Observable Environments | Mar 5, 2024 | Language Modelling, Large Language Model | —Unverified | 0 | 0 |
| ReLoop: "Seeing Twice and Thinking Backwards" via Closed-loop Training to Mitigate Hallucinations in Multimodal understanding | Jul 7, 2025 | Hallucination, Question Answering | —Unverified | 0 | 0 |
| Visual Question Answering in Remote Sensing with Cross-Attention and Multimodal Information Bottleneck | Jun 25, 2023 | object-detection, Object Detection | —Unverified | 0 | 0 |
| Remote Sensing Vision-Language Foundation Models without Annotations via Ground Remote Alignment | Dec 12, 2023 | image-classification, Image Classification | —Unverified | 0 | 0 |
| CL-CrossVQA: A Continual Learning Benchmark for Cross-Domain Visual Question Answering | Nov 19, 2022 | Continual Learning, Question Answering | —Unverified | 0 | 0 |
| Claude 3.5 Sonnet Model Card Addendum | Jun 24, 2024 | Code Generation, MMR total | —Unverified | 0 | 0 |
| Rephrasing visual questions by specifying the entropy of the answer distribution | Apr 10, 2020 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Representation, Learning and Reasoning on Spatial Language for Downstream NLP Tasks | Nov 1, 2020 | Common Sense Reasoning, Question Answering | —Unverified | 0 | 0 |
| Representing Movie Characters in Dialogues | Nov 1, 2019 | Question Answering, Relation Classification | —Unverified | 0 | 0 |
| Reproducibility Report for "Learning To Count Objects In Natural Images For Visual Question Answering" | May 21, 2018 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| RepsNet: Combining Vision with Language for Automated Medical Reports | Sep 27, 2022 | Contrastive Learning, Decoder | —Unverified | 0 | 0 |
| RescueADI: Adaptive Disaster Interpretation in Remote Sensing Images with Autonomous Agents | Oct 17, 2024 | Question Answering, Task Planning | —Unverified | 0 | 0 |
| Visual Question Answering Instruction: Unlocking Multimodal Large Language Model To Domain-Specific Visual Multitasks | Feb 13, 2024 | Language Modeling, Language Modelling | —Unverified | 0 | 0 |
| CLAMP: Contrastive LAnguage Model Prompt-tuning | Dec 4, 2023 | Contrastive Learning, Image Captioning | —Unverified | 0 | 0 |
| Reassessing Evaluation Practices in Visual Question Answering: A Case Study on Out-of-Distribution Generalization | May 24, 2022 | Image Captioning, Out-of-Distribution Generalization | —Unverified | 0 | 0 |
| Rethinking Visual Prompting for Multimodal Large Language Models with External Knowledge | Jul 5, 2024 | Instance Segmentation, Optical Character Recognition (OCR) | —Unverified | 0 | 0 |
| VrR-VG: Refocusing Visually-Relevant Relationships | Feb 1, 2019 | Image Captioning, Question Answering | —Unverified | 0 | 0 |
| Retrieval-Augmented Natural Language Reasoning for Explainable Visual Question Answering | Aug 30, 2024 | Decoder, Language Modeling | —Unverified | 0 | 0 |
| CIC: A Framework for Culturally-Aware Image Captioning | Feb 8, 2024 | Descriptive, Image Captioning | —Unverified | 0 | 0 |
| Retrieval-Augmented Visual Question Answering via Built-in Autoregressive Search Engines | Feb 23, 2025 | Answer Generation, Language Modeling | —Unverified | 0 | 0 |