| Detecting Knowledge Boundary of Vision Large Language Models by Sampling-Based Inference | Feb 25, 2025 | Question Answering, RAG | Code Available | 0 |
| IIU: Independent Inference Units for Knowledge-based Visual Question Answering | Aug 15, 2024 | Question Answering, Visual Question Answering | Code Available | 0 |
| Traveling Across Languages: Benchmarking Cross-Lingual Consistency in Multimodal LLMs | May 21, 2025 | Benchmarking, Question Answering | Code Available | 0 |
| Visually Dehallucinative Instruction Generation | Feb 13, 2024 | Hallucination, Language Modeling | Code Available | 0 |
| II-MMR: Identifying and Improving Multi-modal Multi-hop Reasoning in Visual Question Answering | Feb 16, 2024 | Question Answering, Triplet | Code Available | 0 |
| Treble Counterfactual VLMs: A Causal Approach to Hallucination | Mar 8, 2025 | Autonomous Driving, counterfactual | Code Available | 0 |
| Visually Grounded VQA by Lattice-based Retrieval | Nov 15, 2022 | Information Retrieval, Question Answering | Code Available | 0 |
| Securing Vision-Language Models with a Robust Encoder Against Jailbreak and Adversarial Attacks | Sep 11, 2024 | Image Captioning, Question Answering | Code Available | 0 |
| Visually Interpretable Subtask Reasoning for Visual Question Answering | May 12, 2025 | Attribute, Object Recognition | Code Available | 0 |
| Barlow constrained optimization for Visual Question Answering | Mar 7, 2022 | Question Answering, Visual Question Answering | Code Available | 0 |
| BabelBench: An Omni Benchmark for Code-Driven Analysis of Multimodal and Multistructured Data | Oct 1, 2024 | Code Generation, Logical Reasoning | Code Available | 0 |
| Design as Desired: Utilizing Visual Question Answering for Multimodal Pre-training | Mar 30, 2024 | Contrastive Learning, Question Answering | Code Available | 0 |
| HumaniBench: A Human-Centric Framework for Large Multimodal Models Evaluation | May 16, 2025 | Benchmarking, Ethics | Code Available | 0 |
| HRIBench: Benchmarking Vision-Language Models for Real-Time Human Perception in Human-Robot Interaction | Jun 25, 2025 | Benchmarking, Person Identification | Code Available | 0 |
| AVQACL: A Novel Benchmark for Audio-Visual Question Answering Continual Learning | Jan 1, 2025 | Audio-visual Question Answering, Continual Learning | Code Available | 0 |
| TUBench: Benchmarking Large Vision-Language Models on Trustworthiness with Unanswerable Questions | Oct 5, 2024 | Benchmarking, Hallucination | Code Available | 0 |
| Delving Deeper into Cross-lingual Visual Question Answering | Feb 15, 2022 | Inductive Bias, Question Answering | Code Available | 0 |
| Why do These Match? Explaining the Behavior of Image Similarity Models | May 26, 2019 | Attribute, General Classification | Code Available | 0 |
| Towards Flexible Evaluation for Generative Visual Question Answering | Aug 1, 2024 | Decoder, Generative Visual Question Answering | Code Available | 0 |
| Analyzing the Behavior of Visual Question Answering Models | Jun 23, 2016 | Question Answering, Visual Question Answering | Code Available | 0 |
| Select, Substitute, Search: A New Benchmark for Knowledge-Augmented Visual Question Answering | Mar 9, 2021 | Optical Character Recognition (OCR), Question Answering | Code Available | 0 |
| Self-Critical Reasoning for Robust Visual Question Answering | May 24, 2019 | Question Answering, Visual Question Answering | Code Available | 0 |
| Visual Question Answering: A Survey of Methods and Datasets | Jul 20, 2016 | General Knowledge, Survey | Code Available | 0 |
| WinoGAViL: Gamified Association Benchmark to Challenge Vision-and-Language Models | Jul 25, 2022 | Common Sense Reasoning, General Knowledge | Code Available | 0 |
| How to Determine the Preferred Image Distribution of a Black-Box Vision-Language Model? | Sep 3, 2024 | In-Context Learning, Language Modeling | Code Available | 0 |