SOTAVerified

3D visual grounding

Papers

Showing 26-50 of 82 papers

| Title | Status | Hype |
| --- | --- | --- |
| InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextual Referring | Code | 1 |
| 3D-SPS: Single-Stage 3D Visual Grounding via Referred Point Progressive Selection | Code | 1 |
| Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding | Code | 1 |
| Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding | Code | 1 |
| Look Around and Refer: 2D Synthetic Semantics Knowledge Distillation for 3D Visual Grounding | Code | 1 |
| MiKASA: Multi-Key-Anchor & Scene-Aware Transformer for 3D Visual Grounding | Code | 1 |
| Zero-Shot 3D Visual Grounding from Vision-Language Models | | 0 |
| 3D Scene Graph Guided Vision-Language Pre-training | | 0 |
| 3D Spatial Understanding in MLLMs: Disambiguation and Evaluation | | 0 |
| A Neural Representation Framework with LLM-Driven Spatial Reasoning for Open-Vocabulary 3D Visual Grounding | | 0 |
| AugRefer: Advancing 3D Visual Grounding via Cross-Modal Augmentation and Spatial Relation-based Referring | | 0 |
| Bayesian Self-Training for Semi-Supervised 3D Segmentation | | 0 |
| D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding | | 0 |
| DenseGrounding: Improving Dense Language-Vision Semantics for Ego-Centric 3D Visual Grounding | | 0 |
| Data-Efficient 3D Visual Grounding via Order-Aware Referring | | 0 |
| DSM: Building A Diverse Semantic Map for 3D Visual Grounding | | 0 |
| Dual Attribute-Spatial Relation Alignment for 3D Visual Grounding | | 0 |
| ScanReason: Empowering 3D Visual Grounding with Reasoning Capabilities | | 0 |
| Fine-Grained Spatial and Verbal Losses for 3D Visual Grounding | | 0 |
| From Objects to Anywhere: A Holistic Benchmark for Multi-level Visual Grounding in 3D Scenes | | 0 |
| G^3-LQ: Marrying Hyperbolic Alignment with Explicit Semantic-Geometric Modeling for 3D Visual Grounding | | 0 |
| GroundFlow: A Plug-in Module for Temporal Reasoning on 3D Point Cloud Sequential Grounding | | 0 |
| Intent3D: 3D Object Detection in RGB-D Scans Based on Human Intention | | 0 |
| I Speak and You Find: Robust 3D Visual Grounding with Noisy and Ambiguous Speech Inputs | | 0 |
| Joint Top-Down and Bottom-Up Frameworks for 3D Visual Grounding | | 0 |

No leaderboard results yet.