PC-BEV: An Efficient Polar-Cartesian BEV Fusion Framework for LiDAR Semantic Segmentation | Dec 19, 2024 | LIDAR Semantic Segmentation, Scene Understanding
Dense Audio-Visual Event Localization under Cross-Modal Consistency and Multi-Temporal Granularity Collaboration | Dec 17, 2024 | audio-visual event localization, audio-visual learning
WiseAD: Knowledge Augmented End-to-End Autonomous Driving with Vision-Language Model | Dec 13, 2024 | Autonomous Driving, Decision Making
LLaVA-SpaceSGG: Visual Instruct Tuning for Open-vocabulary Scene Graph Generation with Enhanced Spatial Relations | Dec 9, 2024 | Language Modeling, Language Modelling
Bootstraping Clustering of Gaussians for View-consistent 3D Scene Understanding | Nov 29, 2024 | 3D geometry, 3DGS
ROOT: VLM based System for Indoor Scene Understanding and Beyond | Nov 24, 2024 | Scene Generation, Scene Understanding
TESGNN: Temporal Equivariant Scene Graph Neural Networks for Efficient and Robust Multi-View 3D Scene Understanding | Nov 15, 2024 | Graph Matching, Graph Neural Network
LoLI-Street: Benchmarking Low-Light Image Enhancement and Beyond | Oct 13, 2024 | Autonomous Driving, Autonomous Vehicles
DAF-Net: A Dual-Branch Feature Decomposition Fusion Network with Domain Adaptive for Infrared and Visible Image Fusion | Sep 18, 2024 | Infrared And Visible Image Fusion, Scene Understanding
LED: Light Enhanced Depth Estimation at Night | Sep 12, 2024 | Autonomous Driving, Decoder
Online 3D reconstruction and dense tracking in endoscopic videos | Sep 9, 2024 | 3D Reconstruction, 3D Scene Reconstruction
UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios | Aug 30, 2024 | Attribute, geo-localization
MTMamba++: Enhancing Multi-Task Dense Scene Understanding via Mamba-Based Decoders | Aug 27, 2024 | Decoder, Mamba
RSTeller: Scaling Up Visual Language Modeling in Remote Sensing with Rich Linguistic Semantics from Openly Available Data and Large Language Models | Aug 27, 2024 | Descriptive, Language Modeling
OpenScan: A Benchmark for Generalized Open-Vocabulary 3D Scene Understanding | Aug 20, 2024 | Object, Scene Understanding
Query3D: LLM-Powered Open-Vocabulary Scene Segmentation with Language Embedded 3D Gaussian | Aug 7, 2024 | Autonomous Driving, object-detection
Dynamic Scene Understanding through Object-Centric Voxelization and Neural Rendering | Jul 30, 2024 | Inverse Rendering, NeRF
General Geometry-aware Weakly Supervised 3D Object Detection | Jul 18, 2024 | 3D Object Detection, Object
Dual-Hybrid Attention Network for Specular Highlight Removal | Jul 17, 2024 | highlight removal, Object Recognition
No Train, all Gain: Self-Supervised Gradients Improve Deep Frozen Representations | Jul 15, 2024 | All, Image Retrieval
MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders | Jul 2, 2024 | Boundary Detection, Human Parsing
Uni-DVPS: Unified Model for Depth-Aware Video Panoptic Segmentation | Jul 1, 2024 | Autonomous Driving, Decoder
CSFNet: A Cosine Similarity Fusion Network for Real-Time RGB-X Semantic Segmentation of Driving Scenes | Jul 1, 2024 | Autonomous Vehicles, Image Segmentation
A Two-Stage Masked Autoencoder Based Network for Indoor Depth Completion | Jun 14, 2024 | 3D Reconstruction, Autonomous Driving
MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding | Jun 13, 2024 | Multiple-choice, Scene Understanding
CoPeD-Advancing Multi-Robot Collaborative Perception: A Comprehensive Dataset in Real-World Environments | May 23, 2024 | Pose Estimation, Scene Understanding
Beyond Appearances: Material Segmentation with Embedded Spectral Information from RGB-D imagery | May 17, 2024 | Material Classification, Material Recognition
Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control | May 9, 2024 | Representation Learning, Scene Understanding
DTCLMapper: Dual Temporal Consistent Learning for Vectorized HD Map Construction | May 9, 2024 | Contrastive Learning, Scene Understanding
ECLAIR: A High-Fidelity Aerial LiDAR Dataset for Semantic Segmentation | Apr 16, 2024 | 3D Semantic Segmentation, Management
PyTorchGeoNodes: Enabling Differentiable Shape Programs for 3D Shape Reconstruction | Apr 16, 2024 | 3D Reconstruction, 3D Shape Reconstruction
GOV-NeSF: Generalizable Open-Vocabulary Neural Semantic Fields | Apr 1, 2024 | Open Vocabulary Semantic Segmentation, Open-Vocabulary Semantic Segmentation
Improving Visual Recognition with Hyperbolical Visual Hierarchy Mapping | Apr 1, 2024 | image-classification, Image Classification
VSRD: Instance-Aware Volumetric Silhouette Rendering for Weakly Supervised 3D Object Detection | Mar 29, 2024 | 3D Object Detection, Depth Estimation
Object Pose Estimation via the Aggregation of Diffusion Features | Mar 27, 2024 | Pose Estimation, Scene Understanding
AutoInst: Automatic Instance-Based Segmentation of LiDAR 3D Scans | Mar 24, 2024 | 3D Instance Segmentation, Instance Segmentation
What if...?: Thinking Counterfactual Keywords Helps to Mitigate Hallucination in Large Multi-modal Models | Mar 20, 2024 | counterfactual, Hallucination
GroupContrast: Semantic-aware Self-supervised Representation Learning for 3D Understanding | Mar 14, 2024 | Contrastive Learning, Representation Learning
Optimizing Latent Graph Representations of Surgical Scenes for Zero-Shot Domain Transfer | Mar 11, 2024 | Anatomy, Disentanglement
Stealing Stable Diffusion Prior for Robust Monocular Depth Estimation | Mar 8, 2024 | Depth Estimation, Monocular Depth Estimation
WHU-Synthetic: A Synthetic Perception Dataset for 3-D Multitask Model Research | Feb 29, 2024 | 3D Reconstruction, Attribute
Semantically-aware Neural Radiance Fields for Visual Scene Understanding: A Comprehensive Review | Feb 17, 2024 | Panoptic Segmentation, Scene Segmentation
Towards Precise 3D Human Pose Estimation with Multi-Perspective Spatial-Temporal Relational Transformers | Jan 30, 2024 | 3D Human Pose Estimation, Pose Estimation
UniM-OV3D: Uni-Modality Open-Vocabulary 3D Scene Understanding with Fine-Grained Feature Representation | Jan 21, 2024 | Instance Segmentation, Scene Understanding
RSUD20K: A Dataset for Road Scene Understanding In Autonomous Driving | Jan 14, 2024 | Autonomous Driving, Benchmarking
3DMIT: 3D Multi-modal Instruction Tuning for Scene Understanding | Jan 6, 2024 | Scene Understanding, Visual Question Answering (VQA)
DI-V2X: Learning Domain-Invariant Representation for Vehicle-Infrastructure Collaborative 3D Object Detection | Dec 25, 2023 | 3D Object Detection, object-detection
WildScenes: A Benchmark for 2D and 3D Semantic Segmentation in Large-scale Natural Environments | Dec 23, 2023 | 3D Semantic Segmentation, Domain Adaptation
Pola4All: survey of polarimetric applications and an open-source toolkit to analyze polarization | Dec 22, 2023 | 3D Reconstruction, Depth Estimation
Open3DIS: Open-Vocabulary 3D Instance Segmentation with 2D Mask Guidance | Dec 17, 2023 | 3D Instance Segmentation, 3D Open-Vocabulary Instance Segmentation