Stag-1: Towards Realistic 4D Driving Simulation with Video Generation Model | Dec 6, 2024 | Autonomous Driving, Autonomous Vehicles | Code Available 2
EmbodiedOcc: Embodied 3D Occupancy Prediction for Vision-based Online Scene Understanding | Dec 5, 2024 | Prediction, Scene Understanding | Code Available 2
Designing DNNs for a trade-off between robustness and processing performance in embedded devices | Dec 4, 2024 | Autonomous Driving, Quantization | Unverified 0
Assessing the performance of CT image denoisers using Laguerre-Gauss Channelized Hotelling Observer for lesion detection | Dec 4, 2024 | Deep Learning, Denoising | Unverified 0
SparseLGS: Sparse View Language Embedded Gaussian Splatting | Dec 3, 2024 | Scene Understanding | Unverified 0
BYE: Build Your Encoder with One Sequence of Exploration Data for Long-Term Dynamic Scene Understanding | Dec 3, 2024 | Motion Estimation, Object | Unverified 0
Holistic Understanding of 3D Scenes as Universal Scene Description | Dec 2, 2024 | Instance Segmentation, Mixed Reality | Unverified 0
LSceneLLM: Enhancing Large 3D Scene Understanding Using Adaptive Visual Preferences | Dec 2, 2024 | Embodied Question Answering, Question Answering | Code Available 2
A Semantic Communication System for Real-time 3D Reconstruction Tasks | Dec 2, 2024 | 3D Reconstruction, Scene Understanding | Unverified 0
Occam's LGS: A Simple Approach for Language Gaussian Splatting | Dec 2, 2024 | 3DGS, 3D Reconstruction | Unverified 0
ChatSplat: 3D Conversational Gaussian Splatting | Dec 1, 2024 | Large Language Model, Scene Understanding | Unverified 0
Video-3D LLM: Learning Position-Aware Video Representation for 3D Scene Understanding | Nov 30, 2024 | 3D Question Answering (3D-QA), Position | Code Available 0
Bootstraping Clustering of Gaussians for View-consistent 3D Scene Understanding | Nov 29, 2024 | 3D geometry, 3DGS | Code Available 1
SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation | Nov 29, 2024 | Motion Planning, RAG | Unverified 0
Quantifying the synthetic and real domain gap in aerial scene understanding | Nov 29, 2024 | Domain Adaptation, Scene Understanding | Unverified 0
SceneTAP: Scene-Coherent Typographic Adversarial Planner against Vision-Language Models in Real-World Environments | Nov 28, 2024 | Adversarial Text, Scene Understanding | Unverified 0
GEOBench-VLM: Benchmarking Vision-Language Models for Geospatial Tasks | Nov 28, 2024 | Benchmarking, Object Counting | Code Available 2
InstanceGaussian: Appearance-Semantic Joint Gaussian Representation for 3D Instance-Level Perception | Nov 28, 2024 | 3DGS, Autonomous Driving | Unverified 0
On-chip Hyperspectral Image Segmentation with Fully Convolutional Networks for Scene Understanding in Autonomous Driving | Nov 28, 2024 | Autonomous Driving, Hyperspectral Image Segmentation | Unverified 0
Reconstructing Animals and the Wild | Nov 27, 2024 | 3D Reconstruction, Scene Understanding | Unverified 0
Grid-augmented vision: A simple yet effective approach for enhanced spatial understanding in multi-modal agents | Nov 27, 2024 | Autonomous Navigation, Object Recognition | Code Available 0
LoCATe-GAT: Modeling Multi-Scale Local Context and Action Relationships for Zero-Shot Action Recognition | Nov 27, 2024 | Action Recognition, Graph Attention | Code Available 0
Box for Mask and Mask for Box: weak losses for multi-task partially supervised learning | Nov 26, 2024 | Object, object-detection | Code Available 0
HSI-Drive v2.0: More Data for New Challenges in Scene Understanding for Autonomous Driving | Nov 26, 2024 | Autonomous Driving, Image Segmentation | Unverified 0
Open-Vocabulary Octree-Graph for 3D Scene Understanding | Nov 25, 2024 | Object, Scene Understanding | Unverified 0
An End-to-End Robust Point Cloud Semantic Segmentation Network with Single-Step Conditional Diffusion Models | Nov 25, 2024 | Denoising, Scene Understanding | Code Available 2
RoboSpatial: Teaching Spatial Understanding to 2D and 3D Vision-Language Models for Robotics | Nov 25, 2024 | Robot Manipulation, Scene Understanding | Unverified 0
ROOT: VLM based System for Indoor Scene Understanding and Beyond | Nov 24, 2024 | Scene Generation, Scene Understanding | Code Available 1
UniGaussian: Driving Scene Reconstruction from Multiple Camera Models via Unified Gaussian Representations | Nov 22, 2024 | Autonomous Driving, Scene Understanding | Unverified 0
Multimodal 3D Reasoning Segmentation with Complex Scenes | Nov 21, 2024 | Reasoning Segmentation, Scene Understanding | Unverified 0
GaussianPretrain: A Simple Unified 3D Gaussian Representation for Visual Pre-training in Autonomous Driving | Nov 19, 2024 | 3D Object Detection, Autonomous Driving | Code Available 2
Classification of Geographical Land Structure Using Convolution Neural Network and Transfer Learning | Nov 19, 2024 | Scene Understanding, Transfer Learning | Unverified 0
Calibrated and Efficient Sampling-Free Confidence Estimation for LiDAR Scene Semantic Segmentation | Nov 18, 2024 | Autonomous Driving, LIDAR Semantic Segmentation | Unverified 0
The ADUULM-360 Dataset -- A Multi-Modal Dataset for Depth Estimation in Adverse Weather | Nov 18, 2024 | Autonomous Driving, Depth Estimation | Code Available 0
Reducing Label Dependency for Underwater Scene Understanding: A Survey of Datasets, Techniques and Applications | Nov 18, 2024 | Scene Segmentation, Scene Understanding | Unverified 0
MGNiceNet: Unified Monocular Geometric Scene Understanding | Nov 18, 2024 | Autonomous Driving, Autonomous Vehicles | Code Available 0
Memory-Augmented Multimodal LLMs for Surgical VQA via Self-Contained Inquiry | Nov 17, 2024 | Question Answering, Scene Understanding | Unverified 0
MetricGold: Leveraging Text-To-Image Latent Diffusion Models for Metric Depth Estimation | Nov 16, 2024 | Depth Estimation, Monocular Depth Estimation | Code Available 0
Large Language Models (LLMs) as Traffic Control Systems at Urban Intersections: A New Paradigm | Nov 16, 2024 | Autonomous Vehicles, Decision Making | Unverified 0
TESGNN: Temporal Equivariant Scene Graph Neural Networks for Efficient and Robust Multi-View 3D Scene Understanding | Nov 15, 2024 | Graph Matching, Graph Neural Network | Code Available 1
Content-Aware Preserving Image Generation | Nov 15, 2024 | Image Generation, Scene Understanding | Unverified 0
OSMLoc: Single Image-Based Visual Localization in OpenStreetMap with Fused Geometric and Semantic Guidance | Nov 13, 2024 | Depth Estimation, Monocular Depth Estimation | Code Available 2
SE(3) Equivariant Ray Embeddings for Implicit Multi-View Depth Estimation | Nov 11, 2024 | Data Augmentation, Decoder | Unverified 0
Graph-Based Multi-Modal Sensor Fusion for Autonomous Driving | Nov 6, 2024 | Autonomous Driving, Multi-Object Tracking | Unverified 0
Modeling Uncertainty in 3D Gaussian Splatting through Continuous Semantic Splatting | Nov 4, 2024 | Scene Understanding, Uncertainty Quantification | Unverified 0
Multi-task Geometric Estimation of Depth and Surface Normal from Monocular 360° Images | Nov 4, 2024 | Multi-Task Learning, Scene Understanding | Code Available 0
On Deep Learning for Geometric and Semantic Scene Understanding Using On-Vehicle 3D LiDAR | Nov 1, 2024 | 3D Semantic Segmentation, Autonomous Driving | Code Available 2
Symbolic Graph Inference for Compound Scene Understanding | Oct 30, 2024 | Question Answering, Scene Understanding | Unverified 0
UniRiT: Towards Few-Shot Non-Rigid Point Cloud Registration | Oct 30, 2024 | Point Cloud Registration, Representation Learning | Unverified 0
Senna: Bridging Large Vision-Language Models and End-to-End Autonomous Driving | Oct 29, 2024 | Autonomous Driving, Scene Understanding | Code Available 4