Title | Date | Tags | Status
MLLM-For3D: Adapting Multimodal Large Language Model for 3D Reasoning Segmentation | Mar 23, 2025 | Language Modeling Language Modelling | No code listed; unverified
PanopticSplatting: End-to-End Panoptic Gaussian Splatting | Mar 23, 2025 | global-optimization NeRF | No code listed; unverified
Geometric Constrained Non-Line-of-Sight Imaging | Mar 23, 2025 | Scene Understanding Surface Reconstruction | No code listed; unverified
PanoGS: Gaussian-based Panoptic Segmentation for 3D Open Vocabulary Scene Understanding | Mar 23, 2025 | 3DGS Decoder | No code listed; unverified
ClaraVid: A Holistic Scene Reconstruction Benchmark From Aerial Perspective With Delentropy-Based Complexity Profiling | Mar 22, 2025 | Panoptic Segmentation Scene Understanding | No code listed; unverified
ExCap3D: Expressive 3D Scene Understanding via Object Captioning with Varying Detail | Mar 21, 2025 | Object Scene Understanding | No code listed; unverified
From Monocular Vision to Autonomous Action: Guiding Tumor Resection via 3D Reconstruction | Mar 20, 2025 | 3D Reconstruction Anatomy | No code listed; unverified
SemanticFlow: A Self-Supervised Framework for Joint Scene Flow Prediction and Instance Segmentation in Dynamic Environments | Mar 19, 2025 | Autonomous Driving Computational Efficiency | No code listed; unverified
These Magic Moments: Differentiable Uncertainty Quantification of Radiance Field Models | Mar 18, 2025 | Decision Making Scene Understanding | No code listed; unverified
PSA-SSL: Pose and Size-aware Self-Supervised Learning on LiDAR Point Clouds | Mar 18, 2025 | 3D Object Detection 3D Semantic Segmentation | Code Available
ChatBEV: A Visual Language Model that Understands BEV Maps | Mar 18, 2025 | Autonomous Driving Language Modeling | No code listed; unverified
Learning-based 3D Reconstruction in Autonomous Driving: A Comprehensive Survey | Mar 17, 2025 | 3D Reconstruction Autonomous Driving | No code listed; unverified
HIS-GPT: Towards 3D Human-In-Scene Multimodal Understanding | Mar 17, 2025 | Question Answering Scene Understanding | No code listed; unverified
EgoSplat: Open-Vocabulary Egocentric Scene Understanding with Language Embedded 3D Gaussian Splatting | Mar 14, 2025 | Scene Understanding Segmentation | No code listed; unverified
Road Rage Reasoning with Vision-language Models (VLMs): Task Definition and Evaluation Dataset | Mar 14, 2025 | Scene Understanding | No code listed; unverified
TARS: Traffic-Aware Radar Scene Flow Estimation | Mar 13, 2025 | Autonomous Driving object-detection | No code listed; unverified
TGP: Two-modal occupancy prediction with 3D Gaussian and sparse points for 3D Environment Awareness | Mar 13, 2025 | Autonomous Driving Prediction | No code listed; unverified
Graph-Grounded LLMs: Leveraging Graphical Function Calling to Minimize LLM Hallucinations | Mar 13, 2025 | Autonomous Vehicles Knowledge Graphs | No code listed; unverified
Object-Aware DINO (Oh-A-Dino): Enhancing Self-Supervised Representations for Multi-Object Instance Retrieval | Mar 12, 2025 | Object Retrieval | No code listed; unverified
MaskAttn-UNet: A Mask Attention-Driven Framework for Universal Low-Resolution Image Segmentation | Mar 11, 2025 | Image Segmentation Panoptic Segmentation | No code listed; unverified
DIV-FF: Dynamic Image-Video Feature Fields For Environment Understanding in Egocentric Videos | Mar 11, 2025 | Scene Understanding | No code listed; unverified
Generating Robot Constitutions & Benchmarks for Semantic Safety | Mar 11, 2025 | Collision Avoidance Image Generation | No code listed; unverified
General-Purpose Aerial Intelligent Agents Empowered by Large Language Models | Mar 11, 2025 | Motion Planning Scene Understanding | No code listed; unverified
LLaFEA: Frame-Event Complementary Fusion for Fine-Grained Spatiotemporal Understanding in LMMs | Mar 10, 2025 | Position Scene Understanding | No code listed; unverified
CoT-Drive: Efficient Motion Forecasting for Autonomous Driving with LLMs and Chain-of-Thought Prompting | Mar 10, 2025 | Autonomous Driving Knowledge Distillation | No code listed; unverified
Towards Ambiguity-Free Spatial Foundation Model: Rethinking and Decoupling Depth Ambiguity | Mar 8, 2025 | Depth Estimation Scene Understanding | Code Available
Segment Anything, Even Occluded | Mar 8, 2025 | Amodal Instance Segmentation Autonomous Driving | No code listed; unverified
Feature-EndoGaussian: Feature Distilled Gaussian Splatting in Surgical Deformable Scene Reconstruction | Mar 8, 2025 | 3DGS image-classification | No code listed; unverified
SplatTalk: 3D VQA with Gaussian Splatting | Mar 8, 2025 | 3DGS Question Answering | No code listed; unverified
EvidMTL: Evidential Multi-Task Learning for Uncertainty-Aware Semantic Surface Mapping from Monocular RGB Images | Mar 6, 2025 | Depth Estimation Depth Prediction | No code listed; unverified
Vision-Language Models Struggle to Align Entities across Modalities | Mar 5, 2025 | Attribute Code Generation | No code listed; unverified
Improving 6D Object Pose Estimation of metallic Household and Industry Objects | Mar 5, 2025 | 6D Pose Estimation using RGB Pose Estimation | No code listed; unverified
SurgiSAM2: Fine-tuning a foundational model for surgical video anatomy segmentation and detection | Mar 5, 2025 | Anatomy Scene Segmentation | No code listed; unverified
Label-Efficient LiDAR Panoptic Segmentation | Mar 4, 2025 | Instance Segmentation Panoptic Segmentation | No code listed; unverified
SSNet: Saliency Prior and State Space Model-based Network for Salient Object Detection in RGB-D Images | Mar 4, 2025 | object-detection Object Detection | No code listed; unverified
Every SAM Drop Counts: Embracing Semantic Priors for Multi-Modality Image Fusion and Beyond | Mar 3, 2025 | Infrared And Visible Image Fusion Scene Understanding | No code listed; unverified
vS-Graphs: Integrating Visual SLAM and Situational Graphs through Multi-level Scene Understanding | Mar 3, 2025 | Scene Understanding Simultaneous Localization and Mapping | No code listed; unverified
OpenGS-SLAM: Open-Set Dense Semantic SLAM with 3D Gaussian Splatting for Object-Level Scene Understanding | Mar 3, 2025 | Scene Understanding Semantic SLAM | No code listed; unverified
Floorplan-SLAM: A Real-Time, High-Accuracy, and Long-Term Multi-Session Point-Plane SLAM for Efficient Floorplan Reconstruction | Mar 1, 2025 | GPU Pose Estimation | No code listed; unverified
VLM-E2E: Enhancing End-to-End Autonomous Driving with Multimodal Driver Attention Fusion | Feb 25, 2025 | Autonomous Driving Navigate | No code listed; unverified
AAD-LLM: Neural Attention-Driven Auditory Scene Understanding | Feb 24, 2025 | Question Answering Response Generation | No code listed; unverified
Dr. Splat: Directly Referring 3D Gaussian Splatting via Direct Language Embedding Registration | Feb 23, 2025 | 3DGS 3D Semantic Segmentation | No code listed; unverified
Hierarchical Context Transformer for Multi-level Semantic Scene Understanding | Feb 21, 2025 | Contrastive Learning Representation Learning | Code Available
AVD2: Accident Video Diffusion for Accident Video Description | Feb 20, 2025 | Autonomous Driving Scene Understanding | No code listed; unverified
Sce2DriveX: A Generalized MLLM Framework for Scene-to-Drive Learning | Feb 19, 2025 | Autonomous Driving Bench2Drive | No code listed; unverified
Understanding and Evaluating Hallucinations in 3D Visual Language Models | Feb 18, 2025 | Diversity Scene Understanding | No code listed; unverified
Surgical Scene Understanding in the Era of Foundation AI Models: A Comprehensive Review | Feb 16, 2025 | Scene Understanding | No code listed; unverified
3D-Grounded Vision-Language Framework for Robotic Task Planning: Automated Prompt Synthesis and Supervised Reasoning | Feb 13, 2025 | Code Generation Scene Understanding | No code listed; unverified
FLARES: Fast and Accurate LiDAR Multi-Range Semantic Segmentation | Feb 13, 2025 | Autonomous Driving LIDAR Semantic Segmentation | No code listed; unverified
sshELF: Single-Shot Hierarchical Extrapolation of Latent Features for 3D Reconstruction from Sparse-Views | Feb 6, 2025 | 3D Reconstruction 3D Scene Reconstruction | No code listed; unverified