| ViewSRD: 3D Visual Grounding via Structured Multi-View Decomposition | Jul 15, 2025 | 3D visual grounding, Visual Grounding | Unverified | 0 |
| A Neural Representation Framework with LLM-Driven Spatial Reasoning for Open-Vocabulary 3D Visual Grounding | Jul 9, 2025 | 3D visual grounding, Autonomous Navigation | Unverified | 0 |
| SPAZER: Spatial-Semantic Progressive Reasoning Agent for Zero-shot 3D Visual Grounding | Jun 27, 2025 | 3D visual grounding, Natural Language Queries | Unverified | 0 |
| GroundFlow: A Plug-in Module for Temporal Reasoning on 3D Point Cloud Sequential Grounding | Jun 26, 2025 | 3D visual grounding, Large Language Model | Unverified | 0 |
| I Speak and You Find: Robust 3D Visual Grounding with Noisy and Ambiguous Speech Inputs | Jun 17, 2025 | 3D visual grounding, Contrastive Learning | Unverified | 0 |
| Unified Representation Space for 3D Visual Grounding | Jun 17, 2025 | 3D visual grounding, Contrastive Learning | Unverified | 0 |
| From Objects to Anywhere: A Holistic Benchmark for Multi-level Visual Grounding in 3D Scenes | Jun 5, 2025 | 3D visual grounding, Object | Unverified | 0 |
| Zero-Shot 3D Visual Grounding from Vision-Language Models | May 28, 2025 | 3D visual grounding, Visual Grounding | Unverified | 0 |
| Extending Large Vision-Language Model for Diverse Interactive Tasks in Autonomous Driving | May 13, 2025 | 3D visual grounding, Autonomous Driving | Code Available | 1 |
| DenseGrounding: Improving Dense Language-Vision Semantics for Ego-Centric 3D Visual Grounding | May 8, 2025 | 3D visual grounding, cross-modal alignment | Unverified | 0 |
| AS3D: 2D-Assisted Cross-Modal Understanding with Semantic-Spatial Scene Graphs for 3D Visual Grounding | May 7, 2025 | 3D visual grounding, Graph Attention | Code Available | 0 |
| Ges3ViG: Incorporating Pointing Gestures into Language-Based 3D Visual Grounding for Embodied Reference Understanding | Apr 13, 2025 | 3D visual grounding, Data Augmentation | Code Available | 0 |
| DSM: Building A Diverse Semantic Map for 3D Visual Grounding | Apr 11, 2025 | 3D visual grounding, Scene Understanding | Unverified | 0 |
| ReasonGrounder: LVLM-Guided Hierarchical Feature Splatting for Open-Vocabulary 3D Visual Grounding and Reasoning | Mar 30, 2025 | 3D visual grounding, Feature Splatting | Unverified | 0 |
| NuGrounding: A Multi-View 3D Visual Grounding Framework in Autonomous Driving | Mar 28, 2025 | 3D visual grounding, Autonomous Driving | Unverified | 0 |
| Unveiling the Mist over 3D Vision-Language Understanding: Object-centric Evaluation with Chain-of-Analysis | Mar 28, 2025 | 3D Question Answering (3D-QA), 3D visual grounding | Code Available | 1 |
| ProxyTransformation: Preshaping Point Cloud Manifold With Proxy Attention For 3D Visual Grounding | Feb 26, 2025 | 3D visual grounding, Visual Grounding | Unverified | 0 |
| Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding | Feb 14, 2025 | 3D Object Detection, 3D visual grounding | Code Available | 3 |
| Evolving Symbolic 3D Visual Grounder with Weakly Supervised Reflection | Feb 3, 2025 | 3D visual grounding, Visual Grounding | Code Available | 1 |
| AugRefer: Advancing 3D Visual Grounding via Cross-Modal Augmentation and Spatial Relation-based Referring | Jan 16, 2025 | 3D visual grounding, Decoder | Unverified | 0 |
| ViGiL3D: A Linguistically Diverse Dataset for 3D Visual Grounding | Jan 2, 2025 | 3D visual grounding, Diagnostic | Unverified | 0 |
| Ges3ViG: Incorporating Pointing Gestures into Language-Based 3D Visual Grounding for Embodied Reference Understanding | Jan 1, 2025 | 3D visual grounding, Data Augmentation | Code Available | 0 |
| Beyond Human Perception: Understanding Multi-Object World from Monocular View | Jan 1, 2025 | 3D visual grounding, Denoising | Code Available | 0 |
| 3D Spatial Understanding in MLLMs: Disambiguation and Evaluation | Dec 9, 2024 | 3D dense captioning, 3D visual grounding | Unverified | 0 |
| SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding | Dec 5, 2024 | 3D visual grounding, Object Localization | Unverified | 0 |
| 3D Scene Graph Guided Vision-Language Pre-training | Nov 27, 2024 | 3D dense captioning, 3D visual grounding | Unverified | 0 |
| BIP3D: Bridging 2D Images and 3D Perception for Embodied Intelligence | Nov 22, 2024 | 3D visual grounding, Visual Grounding | Code Available | 3 |
| Solving Zero-Shot 3D Visual Grounding as Constraint Satisfaction Problems | Nov 21, 2024 | 3D visual grounding, Negation | Code Available | 1 |
| LidaRefer: Outdoor 3D Visual Grounding for Autonomous Driving with Transformers | Nov 7, 2024 | 3D visual grounding, Autonomous Driving | Unverified | 0 |
| Fine-Grained Spatial and Verbal Losses for 3D Visual Grounding | Nov 5, 2024 | 3D visual grounding, Visual Grounding | Unverified | 0 |
| Joint Top-Down and Bottom-Up Frameworks for 3D Visual Grounding | Oct 21, 2024 | 3D visual grounding, Object | Unverified | 0 |
| VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding | Oct 17, 2024 | 3D geometry, 3D visual grounding | Code Available | 2 |
| Bayesian Self-Training for Semi-Supervised 3D Segmentation | Sep 12, 2024 | 3D Instance Segmentation, 3D Semantic Segmentation | Unverified | 0 |
| Task-oriented Sequential Grounding in 3D Scenes | Aug 7, 2024 | 3D visual grounding, Visual Grounding | Unverified | 0 |
| RefMask3D: Language-Guided Transformer for 3D Referring Segmentation | Jul 25, 2024 | 3D visual grounding, Image Segmentation | Code Available | 2 |
| PD-APE: A Parallel Decoding Framework with Adaptive Position Encoding for 3D Visual Grounding | Jul 19, 2024 | 3D visual grounding, Attribute | Unverified | 0 |
| Multi-branch Collaborative Learning Network for 3D Visual Grounding | Jul 7, 2024 | 3D visual grounding, Referring Expression | Code Available | 1 |
| ScanReason: Empowering 3D Visual Grounding with Reasoning Capabilities | Jul 1, 2024 | 3D visual grounding, Language Modeling | Unverified | 0 |
| MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations | Jun 13, 2024 | 3D visual grounding, Attribute | Code Available | 4 |
| Dual Attribute-Spatial Relation Alignment for 3D Visual Grounding | Jun 13, 2024 | 3D visual grounding, Attribute | Unverified | 0 |
| A Survey on Text-guided 3D Visual Grounding: Elements, Recent Advances, and Future Directions | Jun 9, 2024 | 3D visual grounding, Survey | Code Available | 3 |
| Intent3D: 3D Object Detection in RGB-D Scans Based on Human Intention | May 28, 2024 | 3D Object Detection, 3D visual grounding | Unverified | 0 |
| Talk to Parallel LiDARs: A Human-LiDAR Interaction Method Based on 3D Visual Grounding | May 24, 2024 | 3D visual grounding, Autonomous Driving | Unverified | 0 |
| Talk2Radar: Bridging Natural Language with 4D mmWave Radar for 3D Referring Expression Comprehension | May 21, 2024 | 3D visual grounding, Referring Expression | Code Available | 1 |
| Naturally Supervised 3D Visual Grounding with Language-Regularized Concept Learners | Apr 30, 2024 | 3D visual grounding, Visual Grounding | Unverified | 0 |
| Rethinking 3D Dense Caption and Visual Grounding in A Unified Framework through Prompt-based Localization | Apr 17, 2024 | 3D dense captioning, 3D visual grounding | Code Available | 0 |
| Data-Efficient 3D Visual Grounding via Order-Aware Referring | Mar 25, 2024 | 3D visual grounding, Object | Unverified | 0 |
| SeCG: Semantic-Enhanced 3D Visual Grounding via Cross-modal Graph Attention | Mar 13, 2024 | 3D visual grounding, cross-modal alignment | Code Available | 0 |
| MiKASA: Multi-Key-Anchor & Scene-Aware Transformer for 3D Visual Grounding | Mar 5, 2024 | 3D visual grounding, Decision Making | Code Available | 1 |
| SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding | Jan 17, 2024 | 3D visual grounding, Scene Understanding | Unverified | 0 |