Improved Visual Grounding through Self-Consistent Explanations Dec 7, 2023 Language Modelling Large Language Model
Improving Visually Grounded Sentence Representations with Self-Attention Dec 2, 2017 Sentence Visual Grounding
Individuation in Neural Models with and without Visual Grounding Sep 27, 2024 Visual Grounding
Intent3D: 3D Object Detection in RGB-D Scans Based on Human Intention May 28, 2024 3D Object Detection 3D visual grounding
Interactive Reinforcement Learning for Object Grounding via Self-Talking Dec 2, 2017 Object reinforcement-learning
Interactive Visual Grounding of Referring Expressions for Human-Robot Interaction Jun 11, 2018 Question Generation Question-Generation
Interpretable Visual Question Answering by Visual Grounding from Attention Supervision Mining Aug 1, 2018 Question Answering Visual Grounding
Interpretable Visual Question Answering via Reasoning Supervision Sep 7, 2023 Common Sense Reasoning Question Answering
INVIGORATE: Interactive Visual Grounding and Grasping in Clutter Aug 25, 2021 Blocking Object
I Speak and You Find: Robust 3D Visual Grounding with Noisy and Ambiguous Speech Inputs Jun 17, 2025 3D visual grounding Contrastive Learning
Joint Top-Down and Bottom-Up Frameworks for 3D Visual Grounding Oct 21, 2024 3D visual grounding Object
Knowledge Supports Visual Language Grounding: A Case Study on Colour Terms Jul 1, 2020 Diagnostic Object
Language-Guided 3D Object Detection in Point Cloud for Autonomous Driving May 25, 2023 3D Object Detection Autonomous Driving
Language learning using Speech to Image retrieval Sep 9, 2019 Grounded language learning Image Retrieval
LanguageRefer: Spatial-Language Model for 3D Visual Grounding Jul 7, 2021 3D visual grounding Language Modeling
LCV2: An Efficient Pretraining-Free Framework for Grounded Visual Question Answering Jan 29, 2024 Language Modeling Language Modelling
Learning from Synthetic Data for Visual Grounding Mar 20, 2024 Language Modelling Large Language Model
Visually Consistent Hierarchical Image Classification Jun 17, 2024 Classification image-classification
Learning Language Structures through Grounding Jun 14, 2024 Automatic Speech Recognition Dependency Parsing
Learning to Compose and Reason with Language Tree Structures for Visual Grounding Jun 5, 2019 Visual Grounding Visual Reasoning
Learning to Ground VLMs without Forgetting Oct 14, 2024 Decoder Language Modelling
Learning Unsupervised Visual Grounding Through Semantic Self-Supervision Mar 17, 2018 Visual Grounding
Learning Visual Grounding from Generative Vision and Language Model Jul 18, 2024 Attribute Language Modeling
Learning with Difference Attention for Visually Grounded Self-supervised Representations Jun 26, 2023 Self-Supervised Learning Visual Grounding
Less is More: Generating Grounded Navigation Instructions from Landmarks Nov 25, 2021 Decoder Instruction Following
Leveraging Multimodal-LLMs Assisted by Instance Segmentation for Intelligent Traffic Monitoring Feb 16, 2025 Instance Segmentation Language Modeling
Leveraging Past References for Robust Language Grounding Nov 1, 2019 Object Referring Expression
LidaRefer: Outdoor 3D Visual Grounding for Autonomous Driving with Transformers Nov 7, 2024 3D visual grounding Autonomous Driving
Lightweight In-Context Tuning for Multimodal Unified Models Oct 8, 2023 Image Captioning In-Context Learning
Like a bilingual baby: The advantage of visually grounding a bilingual language model Oct 11, 2022 Language Modeling Language Modelling
LLM-Optic: Unveiling the Capabilities of Large Language Models for Universal Visual Grounding May 27, 2024 Visual Grounding
LQMFormer: Language-aware Query Mask Transformer for Referring Image Segmentation Jan 1, 2024 Image Segmentation Semantic Segmentation
M4CXR: Exploring Multi-task Potentials of Multi-modal Large Language Models for Chest X-ray Interpretation Aug 29, 2024 Instruction Following Medical Report Generation
MAMO: Masked Multimodal Modeling for Fine-Grained Vision-Language Representation Learning Oct 9, 2022 Image-text Retrieval multimodal interaction
Medical Phrase Grounding with Region-Phrase Context Contrastive Alignment Mar 14, 2023 Medical Image Analysis Phrase Grounding
MedRG: Medical Report Grounding with Multi-modal Large Language Model Apr 10, 2024 Decoder Language Modeling
MedSG-Bench: A Benchmark for Medical Image Sequences Grounding May 17, 2025 Visual Grounding Visual Question Answering (VQA)
Mitigating Hallucination in Large Vision-Language Models via Adaptive Attention Calibration May 27, 2025 Hallucination Visual Grounding
MMR: Evaluating Reading Ability of Large Multimodal Models Aug 26, 2024 Font Recognition MMR total
MNER-QG: An End-to-End MRC framework for Multimodal Named Entity Recognition with Query Grounding Nov 27, 2022 named-entity-recognition Named Entity Recognition
MoDA: Modulation Adapter for Fine-Grained Visual Grounding in Instructional MLLMs Jun 2, 2025 Instruction Following Text Generation
More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models May 23, 2025 Diagnostic Hallucination
Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level Nov 15, 2024 Benchmarking counterfactual
Movie Box Office Prediction With Self-Supervised and Visually Grounded Pretraining Apr 20, 2023 Visual Grounding
mRAG: Elucidating the Design Space of Multi-modal Retrieval-Augmented Generation May 29, 2025 Question Answering RAG
Multi-Granularity Modularized Network for Abstract Visual Reasoning Jul 9, 2020 Visual Grounding Visual Reasoning
Multimodal Reference Visual Grounding Apr 2, 2025 Few-Shot Object Detection Visual Grounding
Multimodal Unified Attention Networks for Vision-and-Language Interactions Aug 12, 2019 Question Answering Visual Grounding
Multi-task Learning of Hierarchical Vision-Language Representation Dec 3, 2018 Multi-Task Learning Question Answering
NanoMVG: USV-Centric Low-Power Multi-Task Visual Grounding based on Prompt-Guided Camera and 4D mmWave Radar Aug 30, 2024 Autonomous Driving Visual Grounding