Linguistically-aware Attention for Reducing the Semantic-Gap in Vision-Language Tasks Aug 18, 2020 Image Captioning Visual Question Answering (VQA)
Linguistically Driven Graph Capsule Network for Visual Question Reasoning Mar 23, 2020 Question Answering Visual Question Answering
Linguistically Routing Capsule Network for Out-of-Distribution Visual Question Answering Jan 1, 2021 Novel Concepts Question Answering
LiT-4-RSVQA: Lightweight Transformer-based Visual Question Answering in Remote Sensing Jun 1, 2023 Question Answering Visual Question Answering
LiVLR: A Lightweight Visual-Linguistic Reasoning Framework for Video Question Answering Nov 29, 2021 Diversity Question Answering
利用图像描述与知识图谱增强表示的视觉问答(Exploiting Image Captions and External Knowledge as Representation Enhancement for Visual Question Answering) Aug 1, 2021 Image Captioning Question Answering
LLaVA-Ultra: Large Chinese Language and Vision Assistant for Ultrasound Oct 19, 2024 Instruction Following Knowledge Distillation
LLM4VG: Large Language Models Evaluation for Video Grounding Dec 21, 2023 Image Captioning Video Grounding
Localize, Group, and Select: Boosting Text-VQA by Scene Text Modeling Aug 20, 2021 Data Ablation Optical Character Recognition
Localizing Before Answering: A Hallucination Evaluation Benchmark for Grounded Medical Multimodal LLMs Apr 30, 2025 Hallucination Hallucination Evaluation
Locate Then Generate: Bridging Vision and Language with Bounding Box for Scene-Text VQA Apr 4, 2023 Answer Generation Language Modelling
Logically Consistent Loss for Visual Question Answering Nov 19, 2020 Multi-Task Learning Question Answering
LoGra-Med: Long Context Multi-Graph Alignment for Medical Vision-Language Model Oct 3, 2024 image-classification Image Classification
LOIS: Looking Out of Instance Semantics for Visual Question Answering Jul 26, 2023 Question Answering Visual Question Answering
Long-Form Answers to Visual Questions from Blind and Low Vision People Aug 12, 2024 Form Visual Question Answering (VQA)
Look, Learn and Leverage (L^3): Mitigating Visual-Domain Shift and Discovering Intrinsic Relations via Symbolic Alignment Aug 30, 2024 Question Answering Representation Learning
Look, Read and Ask: Learning to Ask Questions by Reading Text in Images Nov 23, 2022 Optical Character Recognition (OCR) Question Answering
Losing Visual Needles in Image Haystacks: Vision Language Models are Easily Distracted in Short and Long Contexts Jun 24, 2024 Mathematical Reasoning Visual Question Answering (VQA)
Zero-shot and Few-shot Learning with Knowledge Graphs: A Comprehensive Survey Dec 18, 2021 Data Augmentation Few-Shot Learning
LRRA:A Transparent Neural-Symbolic Reasoning Framework for Real-World Visual Question Answering Aug 1, 2021 Question Answering Visual Question Answering
Can You Explain That? Lucid Explanations Help Human-AI Collaborative Image Retrieval Apr 5, 2019 Image Retrieval Question Answering
Lyrics: Boosting Fine-grained Language-Vision Alignment and Comprehension via Semantic-aware Visual Objects Dec 8, 2023 Image Captioning object-detection
M4CXR: Exploring Multi-task Potentials of Multi-modal Large Language Models for Chest X-ray Interpretation Aug 29, 2024 Instruction Following Medical Report Generation
MAGIC-VQA: Multimodal And Grounded Inference with Commonsense Knowledge for Visual Question Answering Mar 24, 2025 Graph Neural Network Question Answering
Making the V in Text-VQA Matter Aug 1, 2023 Optical Character Recognition (OCR) TextVQA
Making Video Quality Assessment Models Sensitive to Frame Rate Distortions May 21, 2022 Video Quality Assessment Visual Question Answering (VQA)
Making Video Quality Assessment Models Robust to Bit Depth Apr 25, 2023 Video Quality Assessment Visual Question Answering (VQA)
MAMO: Masked Multimodal Modeling for Fine-Grained Vision-Language Representation Learning Oct 9, 2022 Image-text Retrieval multimodal interaction
MANGO: Enhancing the Robustness of VQA Models via Adversarial Noise Generation Jan 16, 2022 Logical Reasoning Question Answering
Mask4Align: Aligned Entity Prompting with Color Masks for Multi-Entity Localization Problems Jan 1, 2024 Question Answering Visual Question Answering
MatCha: Enhancing Visual Language Pretraining with Math Reasoning and Chart Derendering Dec 19, 2022 Chart Question Answering Data Summarization
Measuring CLEVRness: Black-box Testing of Visual Reasoning Models Sep 29, 2021 Benchmarking Diagnostic
Measuring CLEVRness: Blackbox testing of Visual Reasoning Models Feb 24, 2022 Benchmarking Diagnostic
Measuring Machine Intelligence Through Visual Question Answering Aug 31, 2016 Image Captioning Question Answering
Med-2E3: A 2D-Enhanced 3D Medical Multimodal Large Language Model Nov 19, 2024 Language Modeling Language Modelling
MedFrameQA: A Multi-Image Medical VQA Benchmark for Clinical Reasoning May 22, 2025 Diagnostic Visual Question Answering (VQA)
Medical Visual Question Answering: A Survey Nov 19, 2021 Medical Visual Question Answering Question Answering
Medical visual question answering using joint self-supervised learning Feb 25, 2023 Decoder Diversity
MedSG-Bench: A Benchmark for Medical Image Sequences Grounding May 17, 2025 Visual Grounding Visual Question Answering (VQA)
On Incorporating Semantic Prior Knowledge in Deep Learning Through Embedding-Space Constraints Sep 30, 2019 Data Augmentation Question Answering
OnlineVPO: Align Video Diffusion Model with Online Video-Centric Preference Optimization Dec 19, 2024 Video Quality Assessment Visual Question Answering (VQA)
On the Cognition of Visual Question Answering Models and Human Intelligence: A Comparative Study Oct 4, 2023 Question Answering Visual Question Answering
On the Effects of Video Grounding on Language Models Oct 1, 2022 Image Captioning Question Answering
On the Efficacy of Co-Attention Transformer Layers in Visual Question Answering Jan 11, 2022 POS Question Answering
On the Flip Side: Identifying Counterexamples in Visual Question Answering Jun 3, 2018 Question Answering Visual Question Answering
On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering Feb 24, 2020 Question Answering Referring Expression
On the Robustness of Large Multimodal Models Against Image Adversarial Attacks Dec 6, 2023 Image Captioning image-classification
On the Role of Visual Grounding in VQA Jun 26, 2024 Visual Grounding Visual Question Answering (VQA)
On the Significance of Question Encoder Sequence Model in the Out-of-Distribution Performance in Visual Question Answering Aug 28, 2021 Graph Attention Question Answering
On the Value of Out-of-Distribution Testing: An Example of Goodhart's Law May 19, 2020 Model Selection Question Answering