| "See the World, Discover Knowledge": A Chinese Factuality Evaluation for Large Vision Language Models | Feb 17, 2025 | Object Recognition, Question Answering | —Unverified | 0 | 0 |
| SegEQA: Video Segmentation Based Visual Attention for Embodied Question Answering | Oct 1, 2019 | Embodied Question Answering, Question Answering | —Unverified | 0 | 0 |
| Segmentation-guided Attention for Visual Question Answering from Remote Sensing Images | Jul 11, 2024 | Question Answering, Segmentation | —Unverified | 0 | 0 |
| Segmentation Guided Attention Networks for Visual Question Answering | Jul 1, 2017 | Common Sense Reasoning, Question Answering | —Unverified | 0 | 0 |
| Select2Plan: Training-Free ICL-Based Planning through VQA and Memory Retrieval | Nov 6, 2024 | Autonomous Navigation, In-Context Learning | —Unverified | 0 | 0 |
| Selectively Answering Visual Questions | Jun 3, 2024 | Avg, In-Context Learning | —Unverified | 0 | 0 |
| Visual Question Reasoning on General Dependency Tree | Mar 31, 2018 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Explainable High-order Visual Question Reasoning: A New Benchmark and Knowledge-routed Network | Sep 23, 2019 | Question Answering, Triplet | —Unverified | 0 | 0 |
| SelfGraphVQA: A Self-Supervised Graph Neural Network for Scene-based Question Answering | Oct 3, 2023 | Graph Neural Network, Question Answering | —Unverified | 0 | 0 |
| Self-Segregating and Coordinated-Segregating Transformer for Focused Deep Multi-Modular Network for Visual Question Answering | Jun 25, 2020 | Diversity, Question Answering | —Unverified | 0 | 0 |
| Can you even tell left from right? Presenting a new challenge for VQA | Mar 15, 2022 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Can We Generate Visual Programs Without Prompting LLMs? | Dec 11, 2024 | Data Augmentation, Question Answering | —Unverified | 0 | 0 |
| WeaQA: Weak Supervision via Captions for Visual Question Answering | Dec 4, 2020 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Visual Reference Resolution using Attention Memory for Visual Dialog | Sep 23, 2017 | Parameter Prediction, Question Answering | —Unverified | 0 | 0 |
| Self-Training Large Language Models for Improved Visual Program Synthesis With Visual Reinforcement | Apr 6, 2024 | Image-text Retrieval, object-detection | —Unverified | 0 | 0 |
| Semantic Aligned Multi-modal Transformer for Vision-Language Understanding: A Preliminary Study on Visual QA | Jun 1, 2021 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Visual Relationship Detection using Scene Graphs: A Survey | May 16, 2020 | Graph Generation, Image Generation | —Unverified | 0 | 0 |
| Semantic-aware Modular Capsule Routing for Visual Question Answering | Jul 21, 2022 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Semantic Composition in Visually Grounded Language Models | May 15, 2023 | Image Captioning, Inductive Bias | —Unverified | 0 | 0 |
| Semantic-enhanced Modality-asymmetric Retrieval for Online E-commerce Search | Jun 25, 2025 | Question Answering, Retrieval | —Unverified | 0 | 0 |
| Can Visual Language Models Replace OCR-Based Visual Question Answering Pipelines in Production? A Case Study in Retail | Aug 28, 2024 | Optical Character Recognition, Optical Character Recognition (OCR) | —Unverified | 0 | 0 |
| Sensor2Text: Enabling Natural Language Interactions for Daily Activity Tracking Using Wearable Sensors | Oct 26, 2024 | Question Answering, Transfer Learning | —Unverified | 0 | 0 |
| Sentence Attention Blocks for Answer Grounding | Sep 20, 2023 | Question Answering, Sentence | —Unverified | 0 | 0 |
| ABC-CNN: An Attention Based Convolutional Neural Network for Visual Question Answering | Nov 18, 2015 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |
| Can SAR improve RSVQA performance? | Aug 28, 2024 | Question Answering, Visual Question Answering | —Unverified | 0 | 0 |