| Compact Trilinear Interaction for Visual Question Answering | Sep 26, 2019 | Benchmarking, Knowledge Distillation | Code Available | 0 |
| TCC-Bench: Benchmarking the Traditional Chinese Culture Understanding Capabilities of MLLMs | May 16, 2025 | Benchmarking, Question Answering | Code Available | 0 |
| Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering | Dec 2, 2016 | Visual Question Answering, Visual Question Answering (VQA) | Code Available | 0 |
| LXMERT Model Compression for Visual Question Answering | Oct 23, 2023 | model, Model Compression | Code Available | 0 |
| Perceptual Score: What Data Modalities Does Your Model Perceive? | Oct 27, 2021 | Question Answering, Visual Dialog | Code Available | 0 |
| Enhancing Vietnamese VQA through Curriculum Learning on Raw and Augmented Text Representations | Mar 5, 2025 | Question Answering, Visual Question Answering | Code Available | 0 |
| TeamLoRA: Boosting Low-Rank Adaptation with Expert Collaboration and Competition | Aug 19, 2024 | GPU, Multi-Task Learning | Code Available | 0 |
| LPF: A Language-Prior Feedback Objective Function for De-biased Visual Question Answering | May 29, 2021 | Question Answering, Visual Question Answering | Code Available | 0 |
| Enhancing Cross-Prompt Transferability in Vision-Language Models through Contextual Injection of Target Tokens | Jun 19, 2024 | Caption Generation, image-classification | Code Available | 0 |
| CommVQA: Situating Visual Question Answering in Communicative Contexts | Feb 22, 2024 | Question Answering, Visual Question Answering | Code Available | 0 |
| COLUMBUS: Evaluating COgnitive Lateral Understanding through Multiple-choice reBUSes | Sep 6, 2024 | Multiple-choice, Question Answering | Code Available | 0 |
| Lost in Space: Probing Fine-grained Spatial Understanding in Vision and Language Resamplers | Apr 21, 2024 | Diagnostic, Image Captioning | Code Available | 0 |
| PitVQA++: Vector Matrix-Low-Rank Adaptation for Open-Ended Visual Question Answering in Pituitary Surgery | Feb 19, 2025 | Question Answering, Visual Question Answering | Code Available | 0 |
| Temporal Reasoning via Audio Question Answering | Nov 21, 2019 | Audio Question Answering, Diagnostic | Code Available | 0 |
| Loss re-scaling VQA: Revisiting the Language Prior Problem from a Class-imbalance View | Oct 30, 2020 | Face Recognition, image-classification | Code Available | 0 |
| Logical Implications for Visual Question Answering Consistency | Mar 16, 2023 | Language Modeling, Language Modelling | Code Available | 0 |
| p-Laplacian Adaptation for Generative Pre-trained Vision-Language Models | Dec 17, 2023 | Image Captioning, Question Answering | Code Available | 0 |
| Enhancing Continual Learning in Visual Question Answering with Modality-Aware Feature Distillation | Jun 27, 2024 | Continual Learning, Question Answering | Code Available | 0 |
| Locally Smoothed Neural Networks | Nov 22, 2017 | Face Verification, Question Answering | Code Available | 0 |
| Plug-and-Play VQA: Zero-shot VQA by Conjoining Large Pretrained Models with Zero Training | Oct 17, 2022 | Image Captioning, Network Interpretation | Code Available | 0 |
| LLM-Assisted Multi-Teacher Continual Learning for Visual Question Answering in Robotic Surgery | Feb 26, 2024 | Continual Learning, Exemplar-Free | Code Available | 0 |
| LLaVA-OneVision: Easy Visual Task Transfer | Aug 6, 2024 | 3D Question Answering (3D-QA) | Code Available | 0 |
| VizWiz Grand Challenge: Answering Visual Questions from Blind People | Feb 22, 2018 | Question Answering, Visual Question Answering | Code Available | 0 |
| A Unified Hallucination Mitigation Framework for Large Vision-Language Models | Sep 24, 2024 | Hallucination, Question Answering | Code Available | 0 |
| LININ: Logic Integrated Neural Inference Network for Explanatory Visual Question Answering | Dec 24, 2024 | Explanatory Visual Question Answering, Multimodal Reasoning | Code Available | 0 |