Open Scene Understanding: Grounded Situation Recognition Meets Segment Anything for Helping People with Visual Impairments | Jul 15, 2023 | Decoder, Grounded Situation Recognition | Code Available 1
DeepIPCv2: LiDAR-powered Robust Environmental Perception and Navigational Control for Autonomous Vehicle | Jul 13, 2023 | Autonomous Driving, Scene Understanding | Code Available 0
The IMPTC Dataset: An Infrastructural Multi-Person Trajectory and Context Dataset | Jul 12, 2023 | Scene Understanding | Code Available 1
Smart Infrastructure: A Research Junction | Jul 12, 2023 | Scene Understanding, Synthetic Data Generation | Unverified 0
CAT-ViL: Co-Attention Gated Vision-Language Embedding for Visual Question Localized-Answering in Robotic Surgery | Jul 11, 2023 | Question Answering, Scene Understanding | Code Available 1
Test-Time Adaptation for Nighttime Color-Thermal Semantic Segmentation | Jul 10, 2023 | Scene Understanding, Semantic Segmentation | Unverified 0
PSDR-Room: Single Photo to Scene using Differentiable Rendering | Jul 6, 2023 | Scene Understanding | Unverified 0
Towards accurate instance segmentation in large-scale LiDAR point clouds | Jul 6, 2023 | Clustering, Instance Segmentation | Code Available 1
Object Recognition System on a Tactile Device for Visually Impaired | Jul 5, 2023 | object-detection, Object Detection | Unverified 0
AVSegFormer: Audio-Visual Segmentation with Transformer | Jul 3, 2023 | Decoder, Scene Understanding | Code Available 1
Artifacts Mapping: Multi-Modal Semantic Mapping for Object Detection and 3D Localization | Jul 3, 2023 | object-detection, Object Detection | Unverified 0
Towards Open Vocabulary Learning: A Survey | Jun 28, 2023 | Open Set Learning, Out-of-Distribution Detection | Code Available 2
Generalizing Surgical Instruments Segmentation to Unseen Domains with One-to-Many Synthesis | Jun 28, 2023 | Scene Understanding | Code Available 0
Style-transfer based Speech and Audio-visual Scene Understanding for Robot Action Sequence Acquisition from Videos | Jun 27, 2023 | Multi-Task Learning, Scene Understanding | Unverified 0
Physion++: Evaluating Physical Scene Understanding that Requires Online Inference of Different Physical Properties | Jun 27, 2023 | Friction, Scene Understanding | Unverified 0
SSC-RS: Elevate LiDAR Semantic Scene Completion with Representation Separation and BEV Fusion | Jun 27, 2023 | Autonomous Driving, Scene Understanding | Code Available 1
Towards Unseen Triples: Effective Text-Image-joint Learning for Scene Graph Generation | Jun 23, 2023 | Graph Generation, Scene Graph Generation | Unverified 0
OpenMask3D: Open-Vocabulary 3D Instance Segmentation | Jun 23, 2023 | 3D Instance Segmentation, 3D Open-Vocabulary Instance Segmentation | Code Available 2
Semantic-aware Transmission for Robust Point Cloud Classification | Jun 23, 2023 | Classification, Decoder | Unverified 0
Multi-view 3D Object Reconstruction and Uncertainty Modelling with Neural Shape Prior | Jun 17, 2023 | 3D Object Reconstruction, Object | Code Available 1
CorNav: Autonomous Agent with Self-Corrected Planning for Zero-Shot Vision-and-Language Navigation | Jun 17, 2023 | Decision Making, Instruction Following | Unverified 0
PanoOcc: Unified Occupancy Representation for Camera-based 3D Panoptic Segmentation | Jun 16, 2023 | 3D Panoptic Segmentation, Autonomous Driving | Code Available 1
Estimating Generic 3D Room Structures from 2D Annotations | Jun 15, 2023 | Scene Understanding | Code Available 1
DORSal: Diffusion for Object-centric Representations of Scenes et al | Jun 13, 2023 | Neural Rendering, Object | Unverified 0
Neural Projection Mapping Using Reflectance Fields | Jun 11, 2023 | Scene Understanding | Unverified 0
SNeL: A Structured Neuro-Symbolic Language for Entity-Based Multimodal Scene Understanding | Jun 9, 2023 | Scene Understanding | Unverified 0
SNAP: Self-Supervised Neural Maps for Visual Positioning and Semantic Understanding | Jun 8, 2023 | Scene Understanding | Code Available 1
InvPT++: Inverted Pyramid Multi-Task Transformer for Visual Scene Understanding | Jun 8, 2023 | Decoder, Multi-Task Learning | Code Available 2
TopoMask: Instance-Mask-Based Formulation for the Road Topology Problem via Transformer-Based Architecture | Jun 8, 2023 | 3D Lane Detection, Graph Neural Network | Unverified 0
A Dynamic Feature Interaction Framework for Multi-task Visual Perception | Jun 8, 2023 | Autonomous Driving, Depth Estimation | Unverified 0
Towards Label-free Scene Understanding by Vision Foundation Models | Jun 6, 2023 | image-classification, Image Classification | Code Available 1
Disaster Anomaly Detector via Deeper FCDDs for Explainable Initial Responses | Jun 5, 2023 | Anomaly Detection, Disaster Response | Unverified 0
Recyclable Semi-supervised Method Based on Multi-model Ensemble for Video Scene Parsing | Jun 5, 2023 | Scene Parsing, Scene Understanding | Unverified 0
Multi-CLIP: Contrastive Vision-Language Pre-training for Question Answering tasks in 3D Scenes | Jun 4, 2023 | Common Sense Reasoning, Question Answering | Unverified 0
Towards In-context Scene Understanding | Jun 2, 2023 | Depth Estimation, In-Context Learning | Code Available 1
Self-supervised Vision Transformers for 3D Pose Estimation of Novel Objects | May 31, 2023 | 3D Pose Estimation, Contrastive Learning | Code Available 0
Point-GCC: Universal Self-supervised 3D Scene Pre-training via Geometry-Color Contrast | May 31, 2023 | 3D Instance Segmentation, 3D Object Detection | Code Available 1
Dynamic Clustering Transformer Network for Point Cloud Segmentation | May 30, 2023 | Clustering, Decoder | Unverified 0
Fine-Grained is Too Coarse: A Novel Data-Centric Approach for Efficient Scene Graph Generation | May 30, 2023 | Graph Generation, Image Generation | Code Available 0
Multi-Scale Attention for Audio Question Answering | May 29, 2023 | Audio Question Answering, Question Answering | Code Available 1
Robust Category-Level 3D Pose Estimation from Synthetic Data | May 25, 2023 | 3D Pose Estimation, 3D Reconstruction | Unverified 0
Fairness Continual Learning Approach to Semantic Scene Understanding in Open-World Environments | May 25, 2023 | Continual Learning, Continual Semantic Segmentation | Unverified 0
Target-Aware Spatio-Temporal Reasoning via Answering Questions in Dynamics Audio-Visual Scenarios | May 21, 2023 | Audio-visual Question Answering, Audio-Visual Question Answering (AVQA) | Code Available 0
PanoContext-Former: Panoramic Total Scene Understanding with a Transformer | May 21, 2023 | 3D Object Detection, object-detection | Unverified 0
Generating Visual Spatial Description via Holistic 3D Scene Understanding | May 19, 2023 | Scene Understanding, Text Generation | Code Available 1
Vision-Language Pre-training with Object Contrastive Learning for 3D Scene Understanding | May 18, 2023 | Contrastive Learning, Object | Unverified 0
TextSLAM: Visual SLAM with Semantic Planar Text Features | May 17, 2023 | Mixed Reality, Object SLAM | Code Available 2
Bridging the Domain Gap: Self-Supervised 3D Scene Understanding with Foundation Models | May 15, 2023 | 3D Object Detection, Image Captioning | Code Available 1
Cross-Modality Time-Variant Relation Learning for Generating Dynamic Scene Graphs | May 15, 2023 | Relation, Scene Graph Generation | Code Available 0
MetaMorphosis: Task-oriented Privacy Cognizant Feature Generation for Multi-task Learning | May 13, 2023 | Deep Learning, Depth Estimation | Unverified 0