ALFWorld: Aligning Text and Embodied Environments for Interactive Learning Oct 8, 2020 Natural Language Visual Grounding Scene Understanding
Cerberus Transformer: Joint Semantic, Affordance and Attribute Parsing Nov 24, 2021 Attribute Scene Understanding
Mitigating Trade-off: Stream and Query-guided Aggregation for Efficient and Effective 3D Occupancy Prediction Mar 28, 2025 Autonomous Driving Scene Understanding
MLRSNet: A Multi-label High Spatial Resolution Remote Sensing Dataset for Semantic Scene Understanding Oct 1, 2020 Deep Learning image-classification
MonteBoxFinder: Detecting and Filtering Primitives to Fit a Noisy Point Cloud Jul 28, 2022 Scene Understanding
Dual-Hybrid Attention Network for Specular Highlight Removal Jul 17, 2024 highlight removal Object Recognition
Masked Scene Modeling: Narrowing the Gap Between Supervised and Self-Supervised Learning in 3D Scene Understanding Apr 9, 2025 Scene Understanding Self-Supervised Learning
DTCLMapper: Dual Temporal Consistent Learning for Vectorized HD Map Construction May 9, 2024 Contrastive Learning Scene Understanding
DPF: Learning Dense Prediction Fields with Weak Supervision Mar 29, 2023 Intrinsic Image Decomposition Prediction
Mask4D: End-to-End Mask-Based 4D Panoptic Segmentation for LiDAR Sequences Sep 18, 2023 3D Panoptic Segmentation 4D Panoptic Segmentation
MassMIND: Massachusetts Maritime INfrared Dataset Sep 9, 2022 Instance Segmentation Scene Understanding
A Two-Stage Masked Autoencoder Based Network for Indoor Depth Completion Jun 14, 2024 3D Reconstruction Autonomous Driving
AirObject: A Temporally Evolving Graph Embedding for Object Identification Nov 30, 2021 Graph Attention Graph Embedding
Dynamic Graph Message Passing Networks Aug 19, 2019 Image Classification object-detection
A Hybrid Sparse-Dense Monocular SLAM System for Autonomous Driving Aug 17, 2021 Autonomous Driving Depth Estimation
M3D-RPN: Monocular 3D Region Proposal Network for Object Detection Jul 13, 2019 3D Object Detection 3D Object Detection From Monocular Images
MCTS with Refinement for Proposals Selection Games in Scene Understanding Jul 7, 2022 Scene Understanding
LoLI-Street: Benchmarking Low-Light Image Enhancement and Beyond Oct 13, 2024 Autonomous Driving Autonomous Vehicles
Constructing Metric-Semantic Maps using Floor Plan Priors for Long-Term Indoor Localization Mar 20, 2023 3D Object Detection Indoor Localization
LLM-Empowered Embodied Agent for Memory-Augmented Task Planning in Household Robotics Apr 30, 2025 In-Context Learning Object
3D Neural Embedding Likelihood: Probabilistic Inverse Graphics for Robust 6D Pose Estimation Feb 7, 2023 6D Pose Estimation 6D Pose Estimation using RGB
Digging Into Self-Supervised Monocular Depth Estimation Jun 4, 2018 Camera Pose Estimation Depth Estimation
Logic-RAG: Augmenting Large Multimodal Models with Visual-Spatial Knowledge for Road Scene Understanding Mar 16, 2025 Autonomous Driving RAG
LinkNet: Exploiting Encoder Representations for Efficient Semantic Segmentation Jun 14, 2017 GPU Scene Understanding
Affordance Transfer Learning for Human-Object Interaction Detection Apr 7, 2021 Affordance Detection Affordance Recognition
Living Scenes: Multi-object Relocalization and Reconstruction in Changing 3D Environments Dec 14, 2023 3D Reconstruction Decoder
Divide and Conquer: 3D Point Cloud Instance Segmentation With Point-Wise Binarization Jul 22, 2022 3D Instance Segmentation 3D Object Detection
Dynamic Graph Message Passing Networks for Visual Recognition Sep 20, 2022 image-classification Image Classification
LLaVA-SpaceSGG: Visual Instruct Tuning for Open-vocabulary Scene Graph Generation with Enhanced Spatial Relations Dec 9, 2024 Language Modeling Language Modelling
LWSIS: LiDAR-guided Weakly Supervised Instance Segmentation for Autonomous Driving Dec 7, 2022 Autonomous Driving Instance Segmentation
MGNet: Monocular Geometric Scene Understanding for Autonomous Driving Jun 27, 2022 Autonomous Driving Depth Estimation
Deep learning for radar data exploitation of autonomous vehicle Mar 15, 2022 Autonomous Driving Deep Learning
A Survey on Deep Learning Technique for Video Segmentation Jul 2, 2021 Autonomous Driving Deep Learning
LED: Light Enhanced Depth Estimation at Night Sep 12, 2024 Autonomous Driving Decoder
4D Panoptic LiDAR Segmentation Feb 24, 2021 4D Panoptic Segmentation Benchmarking
DeepPanoContext: Panoramic 3D Scene Understanding with Holistic Scene Context Graph and Relation-based Optimization Aug 24, 2021 Diversity Graph Neural Network
Leveraging Large (Visual) Language Models for Robot 3D Scene Understanding Sep 12, 2022 Common Sense Reasoning Scene Classification
CAT-ViL: Co-Attention Gated Vision-Language Embedding for Visual Question Localized-Answering in Robotic Surgery Jul 11, 2023 Question Answering Scene Understanding
A Survey on Deep Learning for Localization and Mapping: Towards the Age of Spatial Machine Intelligence Jun 22, 2020 Deep Learning Scene Understanding
Collaborative Transformers for Grounded Situation Recognition Mar 30, 2022 Grounded Situation Recognition Image Classification
Deep Learning for Event-based Vision: A Comprehensive Survey and Benchmarks Feb 17, 2023 Deblurring Deep Learning
Dense Audio-Visual Event Localization under Cross-Modal Consistency and Multi-Temporal Granularity Collaboration Dec 17, 2024 audio-visual event localization audio-visual learning
Complementary Random Masking for RGB-Thermal Semantic Segmentation Mar 30, 2023 Scene Understanding Semantic Segmentation
Detecting Human-Object Interaction via Fabricated Compositional Learning Mar 15, 2021 Affordance Recognition Human-Object Interaction Detection
A Survey of World Models for Autonomous Driving Jan 20, 2025 Anomaly Detection Autonomous Driving
Diffusion-SS3D: Diffusion Model for Semi-supervised 3D Object Detection Dec 5, 2023 3D Object Detection Denoising
DIP: Unsupervised Dense In-Context Post-training of Visual Representations Jun 23, 2025 GPU Meta-Learning
DI-V2X: Learning Domain-Invariant Representation for Vehicle-Infrastructure Collaborative 3D Object Detection Dec 25, 2023 3D Object Detection object-detection
Distilled Semantics for Comprehensive Scene Understanding from Videos Mar 31, 2020 Depth Estimation Knowledge Distillation
Affect2MM: Affective Analysis of Multimedia Content Using Emotion Causality Mar 11, 2021 Scene Understanding Time Series