SOTAVerified

3D visual grounding

Papers

Showing 51-82 of 82 papers

| Title | Status | Hype |
|---|---|---|
| Towards CLIP-driven Language-free 3D Visual Grounding via 2D-3D Relational Enhancement and Consistency | Code | 0 |
| Multi-Attribute Interactions Matter for 3D Visual Grounding | Code | 0 |
| G^3-LQ: Marrying Hyperbolic Alignment with Explicit Semantic-Geometric Modeling for 3D Visual Grounding | | 0 |
| Viewpoint-Aware Visual Grounding in 3D Scenes | | 0 |
| Weakly-Supervised 3D Visual Grounding based on Visual Linguistic Alignment | | 0 |
| Mono3DVG: 3D Visual Grounding in Monocular Images | Code | 1 |
| Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding | Code | 1 |
| CityRefer: Geography-aware 3D Visual Grounding Dataset on City-scale Point Cloud Data | Code | 1 |
| CoT3DRef: Chain-of-Thoughts Data-Efficient 3D Visual Grounding | Code | 1 |
| LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent | Code | 2 |
| Multi3DRefer: Grounding Text Description to Multiple 3D Objects | Code | 1 |
| Four Ways to Improve Verbo-visual Fusion for Dense 3D Visual Grounding | | 0 |
| 3DRP-Net: 3D Relative Position-aware Network for 3D Visual Grounding | | 0 |
| Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding | Code | 1 |
| Cross3DVG: Cross-Dataset 3D Visual Grounding on Different RGB-D Scans | Code | 1 |
| WildRefer: 3D Object Localization in Large-scale Dynamic Scenes with Multi-modal Visual Data and Natural Language | Code | 0 |
| ViewRefer: Grasp the Multi-view Knowledge for 3D Visual Grounding with GPT and Prototype Guidance | Code | 1 |
| ScanERU: Interactive 3D Visual Grounding based on Embodied Reference Understanding | Code | 0 |
| ViewRefer: Grasp the Multi-view Knowledge for 3D Visual Grounding | | 0 |
| Context-Aware Alignment and Mutual Masking for 3D-Language Pre-Training | Code | 1 |
| UniT3D: A Unified Transformer for 3D Dense Captioning and Visual Grounding | | 0 |
| Look Around and Refer: 2D Synthetic Semantics Knowledge Distillation for 3D Visual Grounding | Code | 1 |
| Learning Point-Language Hierarchical Alignment for 3D Visual Grounding | Code | 1 |
| EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding | Code | 1 |
| 3D-SPS: Single-Stage 3D Visual Grounding via Referred Point Progressive Selection | Code | 1 |
| Multi-View Transformer for 3D Visual Grounding | Code | 1 |
| D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding | | 0 |
| TransRefer3D: Entity-and-Relation Aware Transformer for Fine-Grained 3D Visual Grounding | | 0 |
| LanguageRefer: Spatial-Language Model for 3D Visual Grounding | | 0 |
| SAT: 2D Semantics Assisted Training for 3D Visual Grounding | Code | 1 |
| Refer-it-in-RGBD: A Bottom-up Approach for 3D Visual Grounding in RGBD Images | Code | 1 |
| InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextual Referring | Code | 1 |
Page 2 of 2

No leaderboard results yet.