Surgical Scene Segmentation by Transformer With Asymmetric Feature Enhancement | Oct 23, 2024 | Anatomy, Scene Segmentation | Code Available | 0
PerspectiveNet: Multi-View Perception for Dynamic Scene Understanding | Oct 22, 2024 | Scene Understanding, Text Generation | Unverified | 0
Large Language Models for Autonomous Driving (LLM4AD): Concept, Benchmark, Experiments, and Challenges | Oct 20, 2024 | Autonomous Driving, Decision Making | Unverified | 0
Part-Whole Relational Fusion Towards Multi-Modal Scene Understanding | Oct 19, 2024 | Autonomous Driving, object-detection | Code Available | 0
SAM-Guided Masked Token Prediction for 3D Scene Understanding | Oct 16, 2024 | 3D Object Detection, Knowledge Distillation | Unverified | 0
3DArticCyclists: Generating Synthetic Articulated 8D Pose-Controllable Cyclist Data for Computer Vision Applications | Oct 14, 2024 | 3DGS, 3D Reconstruction | Unverified | 0
Enhancing Single Image to 3D Generation using Gaussian Splatting and Hybrid Diffusion Priors | Oct 12, 2024 | 3D Generation, 3D geometry | Unverified | 0
3D Vision-Language Gaussian Splatting | Oct 10, 2024 | 3D Reconstruction, Autonomous Driving | Unverified | 0
Test-Time Intensity Consistency Adaptation for Shadow Detection | Oct 10, 2024 | Decoder, Diversity | Unverified | 0
A transition towards virtual representations of visual scenes | Oct 10, 2024 | Scene Understanding | Unverified | 0
Evaluating the Impact of Point Cloud Colorization on Semantic Segmentation Accuracy | Oct 9, 2024 | Colorization, Point Cloud Segmentation | Unverified | 0
Open-RGBT: Open-vocabulary RGB-T Zero-shot Semantic Segmentation in Open-world Environments | Oct 9, 2024 | Open Vocabulary Semantic Segmentation, Open-Vocabulary Semantic Segmentation | Unverified | 0
Diffusion Models in 3D Vision: A Survey | Oct 7, 2024 | Autonomous Driving, Computational Efficiency | Unverified | 0
Resource-Efficient Multiview Perception: Integrating Semantic Masking with Masked Autoencoders | Oct 7, 2024 | Multiview Detection, Scene Understanding | Unverified | 0
In-Place Panoptic Radiance Field Segmentation with Perceptual Prior for 3D Scene Understanding | Oct 6, 2024 | 2D Panoptic Segmentation, Autonomous Driving | Unverified | 0
Fast Object Detection with a Machine Learning Edge Device | Oct 5, 2024 | Autonomous Navigation, CPU | Unverified | 0
SPARTUN3D: Situated Spatial Understanding of 3D World in Large Language Models | Oct 4, 2024 | Scene Understanding, Spatial Reasoning | Unverified | 0
RESSCAL3D++: Joint Acquisition and Semantic Segmentation of 3D Point Clouds | Oct 3, 2024 | Scene Understanding, Semantic Segmentation | Code Available | 0
A Critical Assessment of Visual Sound Source Localization Models Including Negative Audio | Oct 1, 2024 | Scene Understanding, Sound Source Localization | Code Available | 0
Class-Agnostic Visio-Temporal Scene Sketch Semantic Segmentation | Sep 30, 2024 | Image Retrieval, Scene Understanding | Unverified | 0
You Only Speak Once to See | Sep 27, 2024 | Contrastive Learning, Object | Unverified | 0
Scene Understanding in Pick-and-Place Tasks: Analyzing Transformations Between Initial and Final Scenes | Sep 26, 2024 | object-detection, Object Detection | Unverified | 0
LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness | Sep 26, 2024 | 3D Question Answering (3D-QA), Position | Unverified | 0
OW-Rep: Open World Object Detection with Instance Representation Learning | Sep 24, 2024 | Novel Class Discovery, Object | Unverified | 0
Diffusion-based RGB-D Semantic Segmentation with Deformable Attention Transformer | Sep 23, 2024 | Scene Understanding, Semantic Segmentation | Unverified | 0
MOSE: Monocular Semantic Reconstruction Using NeRF-Lifted Noisy Priors | Sep 21, 2024 | 2D Semantic Segmentation, 3D Semantic Segmentation | Unverified | 0
Multilateral Cascading Network for Semantic Segmentation of Large-Scale Outdoor Point Clouds | Sep 21, 2024 | Scene Understanding, Semantic Segmentation | Unverified | 0
Relevance-driven Decision Making for Safer and More Efficient Human Robot Collaboration | Sep 21, 2024 | Collision Avoidance, Decision Making | Unverified | 0
CLAIR-A: Leveraging Large Language Models to Judge Audio Captions | Sep 19, 2024 | Audio captioning, Language Modeling | Code Available | 0
Towards Global Localization using Multi-Modal Object-Instance Re-Identification | Sep 18, 2024 | Camera Localization, Object | Code Available | 0
Video Token Sparsification for Efficient Multimodal LLMs in Autonomous Driving | Sep 16, 2024 | Autonomous Driving, Logical Reasoning | Unverified | 0
DAE-Fuse: An Adaptive Discriminative Autoencoder for Multi-Modality Image Fusion | Sep 16, 2024 | Autonomous Driving, Autonomous Navigation | Unverified | 0
Relevance for Human Robot Collaboration | Sep 12, 2024 | Dimensionality Reduction, Scene Understanding | Unverified | 0
Towards Localizing Structural Elements: Merging Geometrical Detection with Semantic Verification in RGB-D Data | Sep 10, 2024 | 3D Plane Detection, 3d scene graph generation | Unverified | 0
Loss Distillation via Gradient Matching for Point Cloud Completion with Weighted Chamfer Distance | Sep 10, 2024 | Bilevel Optimization, Point Cloud Completion | Code Available | 0
TanDepth: Leveraging Global DEMs for Metric Monocular Depth Estimation in UAVs | Sep 8, 2024 | Depth Estimation, Monocular Depth Estimation | Unverified | 0
Future Does Matter: Boosting 3D Object Detection with Temporal Motion Estimation in Point Cloud Sequences | Sep 6, 2024 | 3D Object Detection, Autonomous Driving | Unverified | 0
RCNet: Deep Recurrent Collaborative Network for Multi-View Low-Light Image Enhancement | Sep 6, 2024 | Image Enhancement, Low-Light Image Enhancement | Code Available | 0
Optimizing 3D Gaussian Splatting for Sparse Viewpoint Scene Reconstruction | Sep 5, 2024 | 3DGS, 3D Reconstruction | Unverified | 0
Can LVLMs Obtain a Driver's License? A Benchmark Towards Reliable AGI for Autonomous Driving | Sep 4, 2024 | Autonomous Driving, Decision Making | Unverified | 0
GaussianPU: A Hybrid 2D-3D Upsampling Framework for Enhancing Color Point Clouds via 3D Gaussian Splatting | Sep 3, 2024 | 3DGS, GPU | Unverified | 0
Leaky Wave Antenna-Equipped RF Chipless Tags for Orientation Estimation | Aug 31, 2024 | Scene Understanding, TAG | Unverified | 0
AdaptVision: Dynamic Input Scaling in MLLMs for Versatile Scene Understanding | Aug 30, 2024 | Language Modelling, Large Language Model | Code Available | 0
DriveGenVLM: Real-world Video Generation for Vision Language Model based Autonomous Driving | Aug 29, 2024 | Autonomous Driving, Denoising | Unverified | 0
Str-L Pose: Integrating Point and Structured Line for Relative Pose Estimation in Dual-Graph | Aug 28, 2024 | Autonomous Driving, Graph Neural Network | Unverified | 0
BOX3D: Lightweight Camera-LiDAR Fusion for 3D Object Detection and Localization | Aug 27, 2024 | 3D Object Detection, Benchmarking | Unverified | 0
Interactive Occlusion Boundary Estimation through Exploitation of Synthetic Data | Aug 27, 2024 | Domain Adaptation, Scene Understanding | Unverified | 0
Handling Geometric Domain Shifts in Semantic Segmentation of Surgical RGB and Hyperspectral Images | Aug 27, 2024 | Organ Segmentation, Scene Segmentation | Unverified | 0
FusionSAM: Latent Space driven Segment Anything Model for Multimodal Fusion and Segmentation | Aug 26, 2024 | Autonomous Driving, Image Segmentation | Unverified | 0
3D-VirtFusion: Synthetic 3D Data Augmentation through Generative Diffusion Models and Controllable Editing | Aug 25, 2024 | Data Augmentation, Diversity | Unverified | 0