Towards Robust Algorithms for Surgical Phase Recognition via Digital Twin-based Scene Representation | Oct 26, 2024 | Informativeness, Scene Understanding | Unverified | 0
Surgical Scene Segmentation by Transformer With Asymmetric Feature Enhancement | Oct 23, 2024 | Anatomy, Scene Segmentation | Code Available | 0
PerspectiveNet: Multi-View Perception for Dynamic Scene Understanding | Oct 22, 2024 | Scene Understanding, Text Generation | Unverified | 0
Large Language Models for Autonomous Driving (LLM4AD): Concept, Benchmark, Experiments, and Challenges | Oct 20, 2024 | Autonomous Driving, Decision Making | Unverified | 0
Part-Whole Relational Fusion Towards Multi-Modal Scene Understanding | Oct 19, 2024 | Autonomous Driving, object-detection | Code Available | 0
ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding | Oct 17, 2024 | 3D Semantic Segmentation, Image Generation | Code Available | 2
VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding | Oct 17, 2024 | 3D geometry, 3D visual grounding | Code Available | 2
SAM-Guided Masked Token Prediction for 3D Scene Understanding | Oct 16, 2024 | 3D Object Detection, Knowledge Distillation | Unverified | 0
3DArticCyclists: Generating Synthetic Articulated 8D Pose-Controllable Cyclist Data for Computer Vision Applications | Oct 14, 2024 | 3DGS, 3D Reconstruction | Unverified | 0
LoLI-Street: Benchmarking Low-Light Image Enhancement and Beyond | Oct 13, 2024 | Autonomous Driving, Autonomous Vehicles | Code Available | 1
Enhancing Single Image to 3D Generation using Gaussian Splatting and Hybrid Diffusion Priors | Oct 12, 2024 | 3D Generation, 3D geometry | Unverified | 0
3D Vision-Language Gaussian Splatting | Oct 10, 2024 | 3D Reconstruction, Autonomous Driving | Unverified | 0
A transition towards virtual representations of visual scenes | Oct 10, 2024 | Scene Understanding | Unverified | 0
Test-Time Intensity Consistency Adaptation for Shadow Detection | Oct 10, 2024 | Decoder, Diversity | Unverified | 0
Open-RGBT: Open-vocabulary RGB-T Zero-shot Semantic Segmentation in Open-world Environments | Oct 9, 2024 | Open Vocabulary Semantic Segmentation, Open-Vocabulary Semantic Segmentation | Unverified | 0
Evaluating the Impact of Point Cloud Colorization on Semantic Segmentation Accuracy | Oct 9, 2024 | Colorization, Point Cloud Segmentation | Unverified | 0
Resource-Efficient Multiview Perception: Integrating Semantic Masking with Masked Autoencoders | Oct 7, 2024 | Multiview Detection, Scene Understanding | Unverified | 0
Diffusion Models in 3D Vision: A Survey | Oct 7, 2024 | Autonomous Driving, Computational Efficiency | Unverified | 0
In-Place Panoptic Radiance Field Segmentation with Perceptual Prior for 3D Scene Understanding | Oct 6, 2024 | 2D Panoptic Segmentation, Autonomous Driving | Unverified | 0
Fast Object Detection with a Machine Learning Edge Device | Oct 5, 2024 | Autonomous Navigation, CPU | Unverified | 0
SPARTUN3D: Situated Spatial Understanding of 3D World in Large Language Models | Oct 4, 2024 | Scene Understanding, Spatial Reasoning | Unverified | 0
RESSCAL3D++: Joint Acquisition and Semantic Segmentation of 3D Point Clouds | Oct 3, 2024 | Scene Understanding, Semantic Segmentation | Code Available | 0
A Critical Assessment of Visual Sound Source Localization Models Including Negative Audio | Oct 1, 2024 | Scene Understanding, Sound Source Localization | Code Available | 0
Procedure-Aware Surgical Video-language Pretraining with Hierarchical Knowledge Augmentation | Sep 30, 2024 | Cross-Modal Retrieval, Dynamic Time Warping | Code Available | 2
Class-Agnostic Visio-Temporal Scene Sketch Semantic Segmentation | Sep 30, 2024 | Image Retrieval, Scene Understanding | Unverified | 0
You Only Speak Once to See | Sep 27, 2024 | Contrastive Learning, Object | Unverified | 0
LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness | Sep 26, 2024 | 3D Question Answering (3D-QA), Position | Unverified | 0
Scene Understanding in Pick-and-Place Tasks: Analyzing Transformations Between Initial and Final Scenes | Sep 26, 2024 | object-detection, Object Detection | Unverified | 0
OW-Rep: Open World Object Detection with Instance Representation Learning | Sep 24, 2024 | Novel Class Discovery, Object | Unverified | 0
Learning Multiple Probabilistic Decisions from Latent World Model in Autonomous Driving | Sep 24, 2024 | Autonomous Driving, Imitation Learning | Code Available | 2
Diffusion-based RGB-D Semantic Segmentation with Deformable Attention Transformer | Sep 23, 2024 | Scene Understanding, Semantic Segmentation | Unverified | 0
Multilateral Cascading Network for Semantic Segmentation of Large-Scale Outdoor Point Clouds | Sep 21, 2024 | Scene Understanding, Semantic Segmentation | Unverified | 0
Relevance-driven Decision Making for Safer and More Efficient Human Robot Collaboration | Sep 21, 2024 | Collision Avoidance, Decision Making | Unverified | 0
MOSE: Monocular Semantic Reconstruction Using NeRF-Lifted Noisy Priors | Sep 21, 2024 | 2D Semantic Segmentation, 3D Semantic Segmentation | Unverified | 0
Hier-SLAM: Scaling-up Semantics in SLAM with a Hierarchically Categorical Gaussian Splatting | Sep 19, 2024 | Scene Understanding, Semantic Segmentation | Code Available | 2
CLAIR-A: Leveraging Large Language Models to Judge Audio Captions | Sep 19, 2024 | Audio captioning, Language Modeling | Code Available | 0
DAF-Net: A Dual-Branch Feature Decomposition Fusion Network with Domain Adaptive for Infrared and Visible Image Fusion | Sep 18, 2024 | Infrared And Visible Image Fusion, Scene Understanding | Code Available | 1
Towards Global Localization using Multi-Modal Object-Instance Re-Identification | Sep 18, 2024 | Camera Localization, Object | Code Available | 0
Video Token Sparsification for Efficient Multimodal LLMs in Autonomous Driving | Sep 16, 2024 | Autonomous Driving, Logical Reasoning | Unverified | 0
DAE-Fuse: An Adaptive Discriminative Autoencoder for Multi-Modality Image Fusion | Sep 16, 2024 | Autonomous Driving, Autonomous Navigation | Unverified | 0
PrimeDepth: Efficient Monocular Depth Estimation with a Stable Diffusion Preimage | Sep 13, 2024 | Depth Estimation, Monocular Depth Estimation | Code Available | 2
Relevance for Human Robot Collaboration | Sep 12, 2024 | Dimensionality Reduction, Scene Understanding | Unverified | 0
LED: Light Enhanced Depth Estimation at Night | Sep 12, 2024 | Autonomous Driving, Decoder | Code Available | 1
Loss Distillation via Gradient Matching for Point Cloud Completion with Weighted Chamfer Distance | Sep 10, 2024 | Bilevel Optimization, Point Cloud Completion | Code Available | 0
Towards Localizing Structural Elements: Merging Geometrical Detection with Semantic Verification in RGB-D Data | Sep 10, 2024 | 3D Plane Detection, 3d scene graph generation | Unverified | 0
Online 3D reconstruction and dense tracking in endoscopic videos | Sep 9, 2024 | 3D Reconstruction, 3D Scene Reconstruction | Code Available | 1
TanDepth: Leveraging Global DEMs for Metric Monocular Depth Estimation in UAVs | Sep 8, 2024 | Depth Estimation, Monocular Depth Estimation | Unverified | 0
RCNet: Deep Recurrent Collaborative Network for Multi-View Low-Light Image Enhancement | Sep 6, 2024 | Image Enhancement, Low-Light Image Enhancement | Code Available | 0
Future Does Matter: Boosting 3D Object Detection with Temporal Motion Estimation in Point Cloud Sequences | Sep 6, 2024 | 3D Object Detection, Autonomous Driving | Unverified | 0
Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding | Sep 5, 2024 | Question Answering, Scene Understanding | Code Available | 2