| MuKEA: Multimodal Knowledge Extraction and Accumulation for Knowledge-based Visual Question Answering | Mar 17, 2022 | Implicit Relations, Question Answering | Code Available | 1 | 5 |
| Multi-Modal Answer Validation for Knowledge-Based VQA | Mar 23, 2021 | Question Answering, Retrieval | Code Available | 1 | 5 |
| Explaining Autonomous Driving Actions with Visual Question Answering | Jul 19, 2023 | Autonomous Driving, Autonomous Vehicles | Code Available | 1 | 5 |
| GPT-4V-AD: Exploring Grounding Potential of VQA-oriented GPT-4V for Zero-shot Anomaly Detection | Nov 5, 2023 | Anomaly Detection, Question Answering | Code Available | 1 | 5 |
| CausalChaos! Dataset for Comprehensive Causal Action Question Answering Over Longer Causal Chains Grounded in Dynamic Visual Scenes | Apr 1, 2024 | Causal Discovery, Causal Discovery in Video Reasoning | Code Available | 1 | 5 |
| GMAI-VL-R1: Harnessing Reinforcement Learning for Multimodal Medical Reasoning | Apr 2, 2025 | Decision Making, Diagnostic | Code Available | 1 | 5 |
| Modular Visual Question Answering via Code Generation | Jun 8, 2023 | Code Generation, In-Context Learning | Code Available | 1 | 5 |
| Expert Knowledge-Aware Image Difference Graph Representation Learning for Difference-Aware Medical Visual Question Answering | Jul 22, 2023 | Graph Representation Learning, Language Modeling | Code Available | 1 | 5 |
| EarthVQA: Towards Queryable Earth via Relational Reasoning-Based Remote Sensing Visual Question Answering | Dec 19, 2023 | Object, Object Counting | Code Available | 1 | 5 |
| MMXU: A Multi-Modal and Multi-X-ray Understanding Dataset for Disease Progression | Feb 17, 2025 | Diagnostic, Question Answering | Code Available | 1 | 5 |