NAVER: A Neuro-Symbolic Compositional Automaton for Visual Grounding with Explicit Logic Reasoning | Feb 1, 2025 | Referring Expression, Visual Grounding | Code Available | 1
Panoptic Narrative Grounding | Sep 10, 2021 | Natural Language Visual Grounding, Panoptic Segmentation | Code Available | 1
Visual Grounding in Video for Unsupervised Word Translation | Mar 11, 2020 | Translation, Visual Grounding | Code Available | 1
Visual Grounding of Learned Physical Models | Apr 28, 2020 | Visual Grounding | Code Available | 1
EAGLE: Enhanced Visual Grounding Minimizes Hallucinations in Instructional Multimodal Models | Jan 6, 2025 | Hallucination, Visual Grounding | Unverified | 0
Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual Grounding | Sep 28, 2022 | Decoder, Visual Grounding | Unverified | 0
Dynamic Inference With Grounding Based Vision and Language Models | Jan 1, 2023 | Language Modelling, Referring Expression | Unverified | 0
Bridging Modality Gap for Visual Grounding with Effecitve Cross-modal Distillation | Dec 29, 2023 | Visual Grounding | Unverified | 0
A Neural Representation Framework with LLM-Driven Spatial Reasoning for Open-Vocabulary 3D Visual Grounding | Jul 9, 2025 | 3D visual grounding, Autonomous Navigation | Unverified | 0
Movie Box Office Prediction With Self-Supervised and Visually Grounded Pretraining | Apr 20, 2023 | Visual Grounding | Unverified | 0
LanguageRefer: Spatial-Language Model for 3D Visual Grounding | Jul 7, 2021 | 3D visual grounding, Language Modeling | Unverified | 0
ACTRESS: Active Retraining for Semi-supervised Visual Grounding | Jul 3, 2024 | Binary Classification, Visual Grounding | Unverified | 0
More Thinking, Less Seeing? Assessing Amplified Hallucination in Multimodal Reasoning Models | May 23, 2025 | Diagnostic, Hallucination | Unverified | 0
MNER-QG: An End-to-End MRC framework for Multimodal Named Entity Recognition with Query Grounding | Nov 27, 2022 | named-entity-recognition, Named Entity Recognition | Unverified | 0
BlenderAlchemy: Editing 3D Graphics with Vision-Language Models | Apr 26, 2024 | Game Design, Image Generation | Unverified | 0
MoDA: Modulation Adapter for Fine-Grained Visual Grounding in Instructional MLLMs | Jun 2, 2025 | Instruction Following, Text Generation | Unverified | 0
Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level | Nov 15, 2024 | Benchmarking, counterfactual | Unverified | 0
Data-Efficient 3D Visual Grounding via Order-Aware Referring | Mar 25, 2024 | 3D visual grounding, Object | Unverified | 0
Joint Top-Down and Bottom-Up Frameworks for 3D Visual Grounding | Oct 21, 2024 | 3D visual grounding, Object | Unverified | 0
Don't Look Only Once: Towards Multimodal Interactive Reasoning with Selective Visual Revisitation | May 24, 2025 | Mathematical Reasoning, Multimodal Reasoning | Unverified | 0
I Speak and You Find: Robust 3D Visual Grounding with Noisy and Ambiguous Speech Inputs | Jun 17, 2025 | 3D visual grounding, Contrastive Learning | Unverified | 0
Does Your 3D Encoder Really Work? When Pretrain-SFT from 2D VLMs Meets 3D VLMs | Jun 5, 2025 | cross-modal alignment, Dense Captioning | Unverified | 0
Beyond Object Categories: Multi-Attribute Reference Understanding for Visual Grounding | Mar 25, 2025 | Attribute, Object | Unverified | 0
MMR: Evaluating Reading Ability of Large Multimodal Models | Aug 26, 2024 | Font Recognition, MMR total | Unverified | 0
A Systematic Evaluation of GPT-4V's Multimodal Capability for Medical Image Analysis | Oct 31, 2023 | Descriptive, Medical Image Analysis | Unverified | 0
3D Spatial Understanding in MLLMs: Disambiguation and Evaluation | Dec 9, 2024 | 3D dense captioning, 3D visual grounding | Unverified | 0
Interpretable Visual Question Answering via Reasoning Supervision | Sep 7, 2023 | Common Sense Reasoning, Question Answering | Unverified | 0
Mitigating Hallucination in Large Vision-Language Models via Adaptive Attention Calibration | May 27, 2025 | Hallucination, Visual Grounding | Unverified | 0
Interpretable Visual Question Answering by Visual Grounding from Attention Supervision Mining | Aug 1, 2018 | Question Answering, Visual Grounding | Unverified | 0
Interactive Visual Grounding of Referring Expressions for Human-Robot Interaction | Jun 11, 2018 | Question Generation, Question-Generation | Unverified | 0
Differentiable Parsing and Visual Grounding of Natural Language Instructions for Object Placement | Oct 1, 2022 | Graph Neural Network, Object | Unverified | 0
INVIGORATE: Interactive Visual Grounding and Grasping in Clutter | Aug 25, 2021 | Blocking, Object | Unverified | 0
Interactive Reinforcement Learning for Object Grounding via Self-Talking | Dec 2, 2017 | Object, reinforcement-learning | Unverified | 0
Intent3D: 3D Object Detection in RGB-D Scans Based on Human Intention | May 28, 2024 | 3D Object Detection, 3D visual grounding | Unverified | 0
Differentiable Disentanglement Filter: an Application Agnostic Core Concept Discovery Probe | Sep 4, 2019 | Disentanglement, Visual Grounding | Unverified | 0
Align2Ground: Weakly Supervised Phrase Grounding Guided by Image-Caption Alignment | Mar 27, 2019 | Image Retrieval, Phrase Grounding | Unverified | 0
Differentiable Disentanglement Filter: an Application Agnostic Core Concept Discovery Probe | Jul 17, 2019 | Disentanglement, Visual Grounding | Unverified | 0
Knowledge Supports Visual Language Grounding: A Case Study on Colour Terms | Jul 1, 2020 | Diagnostic, Object | Unverified | 0
Benchmarking Diverse-Modal Entity Linking with Generative Models | May 27, 2023 | Benchmarking, Decoder | Unverified | 0
AIFit: Automatic 3D Human-Interpretable Feedback Models for Fitness Training | Jun 19, 2021 | Visual Grounding | Unverified | 0
Individuation in Neural Models with and without Visual Grounding | Sep 27, 2024 | Visual Grounding | Unverified | 0
Language-Guided 3D Object Detection in Point Cloud for Autonomous Driving | May 25, 2023 | 3D Object Detection, Autonomous Driving | Unverified | 0
Detecting Concrete Visual Tokens for Multimodal Machine Translation | Mar 5, 2024 | Machine Translation, Multimodal Machine Translation | Unverified | 0
Being data-driven is not enough: Revisiting interactive instruction giving as a challenge for NLG | Nov 1, 2018 | Text Generation, Visual Grounding | Unverified | 0
MedSG-Bench: A Benchmark for Medical Image Sequences Grounding | May 17, 2025 | Visual Grounding, Visual Question Answering (VQA) | Unverified | 0
DSM: Building A Diverse Semantic Map for 3D Visual Grounding | Apr 11, 2025 | 3D visual grounding, Scene Understanding | Unverified | 0
LCV2: An Efficient Pretraining-Free Framework for Grounded Visual Question Answering | Jan 29, 2024 | Language Modeling, Language Modelling | Unverified | 0
Dual Attribute-Spatial Relation Alignment for 3D Visual Grounding | Jun 13, 2024 | 3D visual grounding, Attribute | Unverified | 0
Improving Visually Grounded Sentence Representations with Self-Attention | Dec 2, 2017 | Sentence, Visual Grounding | Unverified | 0
DenseGrounding: Improving Dense Language-Vision Semantics for Ego-Centric 3D Visual Grounding | May 8, 2025 | 3D visual grounding, cross-modal alignment | Unverified | 0