| MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations | Jun 13, 2024 | 3D visual grounding, Attribute | Code Available | 4 | 5 |
| BIP3D: Bridging 2D Images and 3D Perception for Embodied Intelligence | Nov 22, 2024 | 3D visual grounding, Visual Grounding | Code Available | 3 | 5 |
| A Survey on Text-guided 3D Visual Grounding: Elements, Recent Advances, and Future Directions | Jun 9, 2024 | 3D visual grounding, Survey | Code Available | 3 | 5 |
| Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding | Feb 14, 2025 | 3D Object Detection, 3D visual grounding | Code Available | 3 | 5 |
| RefMask3D: Language-Guided Transformer for 3D Referring Segmentation | Jul 25, 2024 | 3D visual grounding, Image Segmentation | Code Available | 2 | 5 |
| LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent | Sep 21, 2023 | 3D visual grounding, Language Modeling | Code Available | 2 | 5 |
| VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding | Oct 17, 2024 | 3D geometry, 3D visual grounding | Code Available | 2 | 5 |
| CoT3DRef: Chain-of-Thoughts Data-Efficient 3D Visual Grounding | Oct 10, 2023 | 3D visual grounding, Visual Grounding | Code Available | 1 | 5 |
| CityRefer: Geography-aware 3D Visual Grounding Dataset on City-scale Point Cloud Data | Oct 28, 2023 | 3D visual grounding, Autonomous Vehicles | Code Available | 1 | 5 |
| SAT: 2D Semantics Assisted Training for 3D Visual Grounding | May 24, 2021 | 3D visual grounding, Object | Code Available | 1 | 5 |
| Evolving Symbolic 3D Visual Grounder with Weakly Supervised Reflection | Feb 3, 2025 | 3D visual grounding, Visual Grounding | Code Available | 1 | 5 |
| Context-Aware Alignment and Mutual Masking for 3D-Language Pre-Training | Jan 1, 2023 | 3D dense captioning, 3D visual grounding | Code Available | 1 | 5 |
| EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding | Sep 29, 2022 | 3D visual grounding, Object | Code Available | 1 | 5 |
| Cross3DVG: Cross-Dataset 3D Visual Grounding on Different RGB-D Scans | May 23, 2023 | 3D Reconstruction, 3D visual grounding | Code Available | 1 | 5 |
| Mono3DVG: 3D Visual Grounding in Monocular Images | Dec 13, 2023 | 3D Object Detection, 3D visual grounding | Code Available | 1 | 5 |
| Multi-View Transformer for 3D Visual Grounding | Apr 5, 2022 | 3D visual grounding, Visual Grounding | Code Available | 1 | 5 |
| Learning Point-Language Hierarchical Alignment for 3D Visual Grounding | Oct 22, 2022 | 3D visual grounding, Sentence | Code Available | 1 | 5 |
| Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding | Jul 18, 2023 | 3D visual grounding, Object | Code Available | 1 | 5 |
| MiKASA: Multi-Key-Anchor & Scene-Aware Transformer for 3D Visual Grounding | Mar 5, 2024 | 3D visual grounding, Decision Making | Code Available | 1 | 5 |
| Look Around and Refer: 2D Synthetic Semantics Knowledge Distillation for 3D Visual Grounding | Nov 25, 2022 | 3D visual grounding, Knowledge Distillation | Code Available | 1 | 5 |
| InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextual Referring | Mar 1, 2021 | 3D visual grounding, Attribute | Code Available | 1 | 5 |
| 3D-SPS: Single-Stage 3D Visual Grounding via Referred Point Progressive Selection | Apr 13, 2022 | 3D visual grounding, Visual Grounding | Code Available | 1 | 5 |
| Multi3DRefer: Grounding Text Description to Multiple 3D Objects | Sep 11, 2023 | 3D visual grounding, Contrastive Learning | Code Available | 1 | 5 |
| Extending Large Vision-Language Model for Diverse Interactive Tasks in Autonomous Driving | May 13, 2025 | 3D visual grounding, Autonomous Driving | Code Available | 1 | 5 |
| Multi-branch Collaborative Learning Network for 3D Visual Grounding | Jul 7, 2024 | 3D visual grounding, Referring Expression | Code Available | 1 | 5 |