Neighbor-Vote: Improving Monocular 3D Object Detection through Neighbor Distance Voting Jul 6, 2021 3D Object Detection Autonomous Driving
Multi-task Planar Reconstruction with Feature Warping Guidance Nov 25, 2023 3D Reconstruction Instance Segmentation
Multi-task Geometric Estimation of Depth and Surface Normal from Monocular 360° Images Nov 4, 2024 Multi-Task Learning Scene Understanding
Holistic 3D Scene Parsing and Reconstruction from a Single RGB Image Aug 7, 2018 3D Object Detection Monocular 3D Object Detection
Multi-Resolution Multi-Modal Sensor Fusion For Remote Sensing Data With Label Uncertainty May 2, 2018 Scene Understanding Sensor Fusion
ShelfNet for Fast Semantic Segmentation Nov 27, 2018 Autonomous Driving Decoder
Multimodal Scale Consistency and Awareness for Monocular Self-Supervised Depth Estimation Mar 3, 2021 Autonomous Driving Depth Estimation
Deep Depth from Defocus: how can defocus blur improve 3D estimation using dense neural networks? Sep 5, 2018 3D Reconstruction Depth Estimation
BACS: Background Aware Continual Semantic Segmentation Apr 19, 2024 Autonomous Driving Continual Learning
RESSCAL3D++: Joint Acquisition and Semantic Segmentation of 3D Point Clouds Oct 3, 2024 Scene Understanding Semantic Segmentation
ResUNet-a: a deep learning framework for semantic segmentation of remotely sensed data Apr 1, 2019 Scene Parsing Scene Understanding
Hierarchical Superpixel Segmentation via Structural Information Theory Jan 13, 2025 graph construction graph partitioning
Hierarchical Spatial Proximity Reasoning for Vision-and-Language Navigation Mar 18, 2024 Common Sense Reasoning Efficient Exploration
Veritatem Dies Aperit - Temporally Consistent Depth Prediction Enabled by a Multi-Task Geometric and Semantic Scene Understanding Approach Mar 26, 2019 Autonomous Driving Depth Completion
Veritatem Dies Aperit - Temporally Consistent Depth Prediction Enabled by a Multi-Task Geometric and Semantic Scene Understanding Approach Jun 1, 2019 Autonomous Driving Depth Completion
Hierarchical Context Transformer for Multi-level Semantic Scene Understanding Feb 21, 2025 Contrastive Learning Representation Learning
Grid-augmented vision: A simple yet effective approach for enhanced spatial understanding in multi-modal agents Nov 27, 2024 Autonomous Navigation Object Recognition
Revisiting Distillation for Continual Learning on Visual Question Localized-Answering in Robotic Surgery Jul 22, 2023 Continual Learning Scene Understanding
MultiDepth: Single-Image Depth Estimation via Multi-Task Regression and Classification Jul 25, 2019 Autonomous Vehicles Classification
MovSAM: A Single-image Moving Object Segmentation Framework Based on Deep Thinking Apr 9, 2025 Autonomous Driving Language Modeling
MonoGRNet: A Geometric Reasoning Network for Monocular 3D Object Localization Nov 26, 2018 2D Object Detection 3D Object Detection
Monocular 3D Object Detection with Pseudo-LiDAR Point Cloud Mar 23, 2019 3D Object Detection Depth Estimation
DC-Scene: Data-Centric Learning for 3D Scene Understanding May 21, 2025 Autonomous Driving Scene Understanding
RIO: 3D Object Instance Re-Localization in Changing Indoor Environments Aug 16, 2019 Object Scene Understanding
Video-3D LLM: Learning Position-Aware Video Representation for 3D Scene Understanding Nov 30, 2024 3D Question Answering (3D-QA) Position
Good at captioning, bad at counting: Benchmarking GPT-4V on Earth observation data Jan 31, 2024 Benchmarking Change Detection
Modeling Expectation Violation in Intuitive Physics with Coarse Probabilistic Object Representations Dec 1, 2019 Scene Understanding
General-Purpose Deep Point Cloud Feature Extractor Mar 12, 2018 3D Object Classification 3D Point Cloud Classification
Generalizing Surgical Instruments Segmentation to Unseen Domains with One-to-Many Synthesis Jun 28, 2023 Scene Understanding
APCoTTA: Continual Test-Time Adaptation for Semantic Segmentation of Airborne LiDAR Point Clouds May 15, 2025 Point Cloud Segmentation Scene Understanding
Gated Driver Attention Predictor Aug 1, 2023 Driver Attention Monitoring Prediction
A Critical Assessment of Visual Sound Source Localization Models Including Negative Audio Oct 1, 2024 Scene Understanding Sound Source Localization
Model-based inexact graph matching on top of CNNs for semantic scene understanding Jan 18, 2023 Brain Segmentation Deep Learning
Gated2Depth: Real-time Dense Lidar from Gated Images Feb 13, 2019 Scene Understanding
GaIA: Graphical Information Gain based Attention Network for Weakly Supervised Point Cloud Semantic Segmentation Oct 2, 2022 Scene Understanding Segmentation
MLM: A Benchmark Dataset for Multitask Learning with Multiple Languages and Modalities Aug 14, 2020 Representation Learning Scene Understanding
FunnyNet-W: Multimodal Learning of Funny Moments in Videos in the Wild Jan 8, 2024 Language Modelling Large Language Model
Rotation Invariant Convolutions for 3D Point Clouds Deep Learning Aug 17, 2019 Deep Learning Scene Understanding
MLLM-SUL: Multimodal Large Language Model for Semantic Scene Understanding and Localization in Traffic Scenarios Dec 27, 2024 Autonomous Driving Language Modeling
Mitigating Object Dependencies: Improving Point Cloud Self-Supervised Learning through Object Exchange Apr 11, 2024 Object Scene Understanding
DA-RNN: Semantic Mapping with Data Associated Recurrent Neural Networks Mar 9, 2017 Scene Understanding
MGNiceNet: Unified Monocular Geometric Scene Understanding Nov 18, 2024 Autonomous Driving Autonomous Vehicles
MetricGold: Leveraging Text-To-Image Latent Diffusion Models for Metric Depth Estimation Nov 16, 2024 Depth Estimation Monocular Depth Estimation
Collaborative Propagation on Multiple Instance Graphs for 3D Instance Segmentation with Single-point Supervision Aug 10, 2022 3D Instance Segmentation Instance Segmentation
Improving Social Awareness Through DANTE: A Deep Affinity Network for Clustering Conversational Interactants Jul 24, 2019 Clustering Graph Clustering
DADA: Driver Attention Prediction in Driving Accident Scenarios Dec 18, 2019 Driver Attention Monitoring Prediction
Structure-Aware Residual Pyramid Network for Monocular Depth Estimation Jul 13, 2019 Decoder Depth Estimation
METEOR Guided Divergence for Video Captioning Dec 20, 2022 Hierarchical Reinforcement Learning Scene Understanding
MC-PanDA: Mask Confidence for Panoptic Domain Adaptation Jul 19, 2024 Domain Adaptation Panoptic Segmentation
Cross-Modality Time-Variant Relation Learning for Generating Dynamic Scene Graphs May 15, 2023 Relation Scene Graph Generation