Shape Anchor Guided Holistic Indoor Scene Understanding | Sep 20, 2023 | 3D Object Detection object-detection | Code Available | 0
LiON: Learning Point-wise Abstaining Penalty for LiDAR Outlier DetectioN Using Diverse Synthetic Data | Sep 19, 2023 | Anomaly Detection Autonomous Driving | Code Available | 1
Mask4D: End-to-End Mask-Based 4D Panoptic Segmentation for LiDAR Sequences | Sep 18, 2023 | 3D Panoptic Segmentation 4D Panoptic Segmentation | Code Available | 1
PanoMixSwap Panorama Mixing via Structural Swapping for Indoor Scene Understanding | Sep 18, 2023 | Data Augmentation Diversity | Unverified | 0
So you think you can track? | Sep 13, 2023 | Benchmarking Object | Unverified | 0
Rank2Tell: A Multimodal Driving Dataset for Joint Importance Ranking and Reasoning | Sep 12, 2023 | Autonomous Vehicles Question Answering | Unverified | 0
AmodalSynthDrive: A Synthetic Amodal Perception Dataset for Autonomous Driving | Sep 12, 2023 | Autonomous Driving Benchmarking | Unverified | 0
HOC-Search: Efficient CAD Model and Pose Retrieval from RGB-D Scans | Sep 12, 2023 | 3D Object Retrieval 3D Scene Reconstruction | Code Available | 1
Can you text what is happening? Integrating pre-trained language encoders into trajectory prediction models for autonomous driving | Sep 11, 2023 | Autonomous Driving Descriptive | Unverified | 0
Multi3DRefer: Grounding Text Description to Multiple 3D Objects | Sep 11, 2023 | 3D visual grounding Contrastive Learning | Code Available | 1
PAg-NeRF: Towards fast and efficient end-to-end panoptic 3D representations for agricultural robotics | Sep 11, 2023 | 3D Reconstruction Camera Localization | Unverified | 0
Weakly Supervised Point Clouds Transformer for 3D Object Detection | Sep 8, 2023 | 3D Object Detection Object | Unverified | 0
Vote2Cap-DETR++: Decoupling Localization and Describing for End-to-End 3D Dense Captioning | Sep 6, 2023 | 3D dense captioning Caption Generation | Code Available | 1
Structural Concept Learning via Graph Attention for Multi-Level Rearrangement Planning | Sep 5, 2023 | Graph Attention Object Rearrangement | Unverified | 0
OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation | Sep 1, 2023 | 3D Open-Vocabulary Instance Segmentation 3D Open-Vocabulary Object Detection | Code Available | 2
Expanding Frozen Vision-Language Models without Retraining: Towards Improved Robot Perception | Aug 31, 2023 | Activity Recognition Human Activity Recognition | Unverified | 0
Semi-Supervised Semantic Depth Estimation using Symbiotic Transformer and NearFarMix Augmentation | Aug 28, 2023 | Autonomous Vehicles Depth Estimation | Unverified | 0
End-to-end Autonomous Driving using Deep Learning: A Systematic Review | Aug 27, 2023 | Autonomous Driving object-detection | Unverified | 0
Synergizing Contrastive Learning and Optimal Transport for 3D Point Cloud Domain Adaptation | Aug 27, 2023 | Contrastive Learning Domain Adaptation | Unverified | 0
SurGNN: Explainable visual scene understanding and assessment of surgical skill using graph neural networks | Aug 24, 2023 | Scene Understanding | Unverified | 0
SUMMIT: Source-Free Adaptation of Uni-Modal Models to Multi-Modal Targets | Aug 23, 2023 | Autonomous Navigation Pseudo Label | Code Available | 1
Multi-stage Factorized Spatio-Temporal Representation for RGB-D Action and Gesture Recognition | Aug 23, 2023 | Gesture Recognition Scene Understanding | Code Available | 1
Understanding Dark Scenes by Contrasting Multi-Modal Observations | Aug 23, 2023 | Contrastive Learning Scene Understanding | Code Available | 1
ScanNet++: A High-Fidelity Dataset of 3D Indoor Scenes | Aug 22, 2023 | 3D Semantic Segmentation Novel View Synthesis | Code Available | 2
Novel-view Synthesis and Pose Estimation for Hand-Object Interaction from Sparse Views | Aug 22, 2023 | NeRF Neural Rendering | Unverified | 0
Explore and Tell: Embodied Visual Captioning in 3D Environments | Aug 21, 2023 | Image Captioning Navigate | Unverified | 0
Vision Relation Transformer for Unbiased Scene Graph Generation | Aug 18, 2023 | Decoder Graph Generation | Code Available | 1
Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes | Aug 17, 2023 | Language Modeling Language Modelling | Code Available | 2
CASPNet++: Joint Multi-Agent Motion Prediction | Aug 15, 2023 | Autonomous Driving motion prediction | Unverified | 0
FocusFlow: Boosting Key-Points Optical Flow Estimation for Autonomous Driving | Aug 14, 2023 | Autonomous Driving Optical Flow Estimation | Code Available | 1
Temporal DINO: A Self-supervised Video Strategy to Enhance Action Prediction | Aug 8, 2023 | Activity Recognition Autonomous Driving | Unverified | 0
Syn-Mediverse: A Multimodal Synthetic Dataset for Intelligent Scene Understanding of Healthcare Facilities | Aug 6, 2023 | Depth Estimation Instance Segmentation | Unverified | 0
Cognitive TransFuser: Semantics-guided Transformer-based Sensor Fusion for Improved Waypoint Prediction | Aug 4, 2023 | Imitation Learning Scene Understanding | Code Available | 0
Scene-aware Human Pose Generation using Transformer | Aug 4, 2023 | Knowledge Distillation Scene Understanding | Unverified | 0
Weakly Supervised 3D Instance Segmentation without Instance-level Annotations | Aug 3, 2023 | 3D Instance Segmentation Instance Segmentation | Unverified | 0
Interpretable End-to-End Driving Model for Implicit Scene Understanding | Aug 2, 2023 | Graph Generation object-detection | Unverified | 0
Gated Driver Attention Predictor | Aug 1, 2023 | Driver Attention Monitoring Prediction | Code Available | 0
Lowis3D: Language-Driven Open-World Instance-Level 3D Scene Understanding | Aug 1, 2023 | 3D geometry 3D Open-Vocabulary Instance Segmentation | Unverified | 0
TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts | Jul 28, 2023 | Long-range modeling Mixture-of-Experts | Code Available | 2
OAFuser: Towards Omni-Aperture Fusion for Light Field Semantic Segmentation | Jul 28, 2023 | Autonomous Driving Scene Understanding | Code Available | 1
Human-centric Scene Understanding for 3D Large-scale Scenarios | Jul 26, 2023 | Action Recognition Scene Understanding | Code Available | 1
Enhancing image captioning with depth information using a Transformer-based framework | Jul 24, 2023 | Image Captioning Image Paragraph Captioning | Unverified | 0
Challenges for Monocular 6D Object Pose Estimation in Robotics | Jul 22, 2023 | 6D Pose Estimation using RGB Object | Unverified | 0
Revisiting Distillation for Continual Learning on Visual Question Localized-Answering in Robotic Surgery | Jul 22, 2023 | Continual Learning Scene Understanding | Code Available | 0
Improving Online Lane Graph Extraction by Object-Lane Clustering | Jul 20, 2023 | 3D Object Detection Autonomous Driving | Unverified | 0
Mining Conditional Part Semantics with Occluded Extrapolation for Human-Object Interaction Detection | Jul 19, 2023 | Human-Object Interaction Detection Object | Unverified | 0
CPCM: Contextual Point Cloud Modeling for Weakly-supervised Point Cloud Semantic Segmentation | Jul 19, 2023 | Representation Learning Scene Understanding | Code Available | 1
Towards A Unified Agent with Foundation Models | Jul 18, 2023 | Efficient Exploration Reinforcement Learning (RL) | Unverified | 0
Human Action Recognition in Still Images Using ConViT | Jul 18, 2023 | Action Recognition Action Recognition In Still Images | Unverified | 0
A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future | Jul 18, 2023 | Knowledge Distillation object-detection | Code Available | 2