MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering | May 20, 2024 | Benchmarking Question Answering | Code Available | 2
Beyond Appearances: Material Segmentation with Embedded Spectral Information from RGB-D imagery | May 17, 2024 | Material Classification Material Recognition | Code Available | 1
Grounded 3D-LLM with Referent Tokens | May 16, 2024 | Dense Captioning Diversity | Code Available | 2
A Preprocessing and Postprocessing Voxel-based Method for LiDAR Semantic Segmentation Improvement in Long Distance | May 16, 2024 | LIDAR Semantic Segmentation Scene Understanding | Unverified | 0
When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models | May 16, 2024 | In-Context Learning Question Answering | Code Available | 7
4D Panoptic Scene Graph Generation | May 16, 2024 | 4D Panoptic Segmentation Graph Generation | Code Available | 3
3D Shape Augmentation with Content-Aware Shape Resizing | May 15, 2024 | 3D Generation Scene Understanding | Unverified | 0
BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation | May 15, 2024 | Dataset Generation Scene Understanding | Unverified | 0
Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control | May 9, 2024 | Representation Learning Scene Understanding | Code Available | 1
DTCLMapper: Dual Temporal Consistent Learning for Vectorized HD Map Construction | May 9, 2024 | Contrastive Learning Scene Understanding | Code Available | 1
Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving | May 8, 2024 | Autonomous Driving LIDAR Semantic Segmentation | Code Available | 3
OpenESS: Event-based Semantic Scene Understanding with Open Vocabularies | May 8, 2024 | Domain Adaptation Scene Understanding | Code Available | 2
DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving | May 7, 2024 | 3D Object Detection Autonomous Driving | Unverified | 0
Q-GroundCAM: Quantifying Grounding in Vision Language Models via GradCAM | Apr 29, 2024 | Phrase Grounding Scene Understanding | Unverified | 0
Seeing Beyond Classes: Zero-Shot Grounded Situation Recognition via Language Explainer | Apr 24, 2024 | Grounded Situation Recognition Scene Understanding | Unverified | 0
On Support Relations Inference and Scene Hierarchy Graph Construction from Point Cloud in Clustered Environments | Apr 22, 2024 | Combinatorial Optimization graph construction | Unverified | 0
CloudFort: Enhancing Robustness of 3D Point Cloud Classification Against Backdoor Attacks via Spatial Partitioning and Ensemble Prediction | Apr 22, 2024 | 3D Point Cloud Classification Autonomous Vehicles | Unverified | 0
BACS: Background Aware Continual Semantic Segmentation | Apr 19, 2024 | Autonomous Driving Continual Learning | Code Available | 0
Unified Scene Representation and Reconstruction for 3D Large Language Models | Apr 19, 2024 | 3D Reconstruction Scene Understanding | Unverified | 0
SPIdepth: Strengthened Pose Information for Self-supervised Monocular Depth Estimation | Apr 18, 2024 | Autonomous Driving Depth Estimation | Code Available | 2
AccidentBlip: Agent of Accident Warning based on MA-former | Apr 18, 2024 | Language Modelling Large Language Model | Unverified | 0
Multimodal 3D Object Detection on Unseen Domains | Apr 17, 2024 | 3D Object Detection Autonomous Driving | Unverified | 0
PyTorchGeoNodes: Enabling Differentiable Shape Programs for 3D Shape Reconstruction | Apr 16, 2024 | 3D Reconstruction 3D Shape Reconstruction | Code Available | 1
ECLAIR: A High-Fidelity Aerial LiDAR Dataset for Semantic Segmentation | Apr 16, 2024 | 3D Semantic Segmentation Management | Code Available | 1
PreGSU-A Generalized Traffic Scene Understanding Model for Autonomous Driving based on Pre-trained Graph Attention Network | Apr 16, 2024 | Autonomous Driving Feature Engineering | Unverified | 0
Gaga: Group Any Gaussians via 3D-aware Memory Bank | Apr 11, 2024 | Contrastive Learning Object Tracking | Unverified | 0
Depth Estimation using Weighted-loss and Transfer Learning | Apr 11, 2024 | Autonomous Vehicles Decoder | Unverified | 0
Mitigating Object Dependencies: Improving Point Cloud Self-Supervised Learning through Object Exchange | Apr 11, 2024 | Object Scene Understanding | Code Available | 0
Incorporating Explanations into Human-Machine Interfaces for Trust and Situation Awareness in Autonomous Vehicles | Apr 10, 2024 | Autonomous Vehicles Scene Understanding | Unverified | 0
O2V-Mapping: Online Open-Vocabulary Mapping with Neural Implicit Representation | Apr 10, 2024 | Image Segmentation Object | Unverified | 0
DaF-BEVSeg: Distortion-aware Fisheye Camera based Bird's Eye View Segmentation with Occlusion Reasoning | Apr 9, 2024 | BEV Segmentation Scene Understanding | Unverified | 0
QueSTMaps: Queryable Semantic Topological Maps for 3D Scene Understanding | Apr 9, 2024 | Scene Understanding Segmentation | Unverified | 0
Panoptic Perception: A Novel Task and Fine-grained Dataset for Universal Remote Sensing Image Interpretation | Apr 6, 2024 | Image Captioning Instance Segmentation | Unverified | 0
Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation | Apr 5, 2024 | Decoder Mamba | Code Available | 3
You Only Scan Once: A Dynamic Scene Reconstruction Pipeline for 6-DoF Robotic Grasping of Novel Objects | Apr 4, 2024 | Object Pose Tracking | Unverified | 0
GOV-NeSF: Generalizable Open-Vocabulary Neural Semantic Fields | Apr 1, 2024 | Open Vocabulary Semantic Segmentation Open-Vocabulary Semantic Segmentation | Code Available | 1
Improving Visual Recognition with Hyperbolical Visual Hierarchy Mapping | Apr 1, 2024 | image-classification Image Classification | Code Available | 1
NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Radiance Fields | Apr 1, 2024 | 3D Object Detection NeRF | Code Available | 2
MM3DGS SLAM: Multi-modal 3D Gaussian Splatting for SLAM Using Vision, Depth, and Inertial Measurements | Apr 1, 2024 | 3DGS Scene Understanding | Unverified | 0
360+x: A Panoptic Multi-modal Scene Understanding Dataset | Apr 1, 2024 | Scene Understanding | Unverified | 0
Adapting to Length Shift: FlexiLength Network for Trajectory Prediction | Mar 31, 2024 | Autonomous Driving Prediction | Unverified | 0
Neural Radiance Field-based Visual Rendering: A Comprehensive Review | Mar 31, 2024 | NeRF Scene Understanding | Unverified | 0
VSRD: Instance-Aware Volumetric Silhouette Rendering for Weakly Supervised 3D Object Detection | Mar 29, 2024 | 3D Object Detection Depth Estimation | Code Available | 1
HGS-Mapping: Online Dense Mapping Using Hybrid Gaussian Representation in Urban Scenes | Mar 29, 2024 | 3DGS Autonomous Vehicles | Unverified | 0
Efficient 3D Instance Mapping and Localization with Neural Fields | Mar 28, 2024 | 3D Instance Segmentation Image Segmentation | Unverified | 0
Object Pose Estimation via the Aggregation of Diffusion Features | Mar 27, 2024 | Pose Estimation Scene Understanding | Code Available | 1
Towards Trustworthy Automated Driving through Qualitative Scene Understanding and Explanations | Mar 25, 2024 | Scene Understanding | Unverified | 0
Calib3D: Calibrating Model Preferences for Reliable 3D Scene Understanding | Mar 25, 2024 | Data Augmentation Scene Understanding | Code Available | 2
Is Your LiDAR Placement Optimized for 3D Scene Understanding? | Mar 25, 2024 | 3D Object Detection LIDAR Semantic Segmentation | Code Available | 2
DOCTR: Disentangled Object-Centric Transformer for Point Scene Understanding | Mar 25, 2024 | Decoder Object | Code Available | 0