3D-VirtFusion: Synthetic 3D Data Augmentation through Generative Diffusion Models and Controllable Editing | Aug 25, 2024 | Data Augmentation, Diversity | Unverified
Extremely Fine-Grained Visual Classification over Resembling Glyphs in the Wild | Aug 25, 2024 | Contrastive Learning, Fine-Grained Image Classification | Code Available
Exploring Scene Affinity for Semi-Supervised LiDAR Semantic Segmentation | Aug 21, 2024 | 3D Semantic Segmentation, Data Augmentation | Code Available
Near, far: Patch-ordering enhances vision foundation models' scene understanding | Aug 20, 2024 | GPU, Scene Understanding | Unverified
3D-Aware Instance Segmentation and Tracking in Egocentric Videos | Aug 19, 2024 | 3D Object Reconstruction, Instance Segmentation | Unverified
SceneGPT: A Language Model for 3D Scene Understanding | Aug 13, 2024 | In-Context Learning, Language Modeling | Unverified
SpectralGaussians: Semantic, spectral 3D Gaussian splatting for multi-spectral scene representation, visualization and analysis | Aug 13, 2024 | 3DGS, Scene Understanding | Unverified
HeLiMOS: A Dataset for Moving Object Segmentation in 3D Point Clouds From Heterogeneous LiDAR Sensors | Aug 12, 2024 | Scene Understanding, Semantic Segmentation | Unverified
Spherical World-Locking for Audio-Visual Localization in Egocentric Videos | Aug 9, 2024 | Active Speaker Localization, Decoder | Unverified
Complete 3d relationships extraction modality alignment network for 3d dense captioning | Aug 1, 2024 | 3D dense captioning, 3D Object Detection | Unverified
DEF-oriCORN: efficient 3D scene understanding for robust language-directed manipulation without demonstrations | Jul 31, 2024 | Motion Planning, Scene Understanding | Unverified
A Plug-and-Play Method for Rare Human-Object Interactions Detection by Bridging Domain Gap | Jul 31, 2024 | Human-Object Interaction Detection, Image Reconstruction | Code Available
NIS-SLAM: Neural Implicit Semantic RGB-D SLAM for 3D Consistent Scene Understanding | Jul 30, 2024 | Scene Understanding, Simultaneous Localization and Mapping | Unverified
From Feature Importance to Natural Language Explanations Using LLMs with RAG | Jul 30, 2024 | counterfactual, Counterfactual Reasoning | Code Available
Rethinking RGB-D Fusion for Semantic Segmentation in Surgical Datasets | Jul 29, 2024 | Decoder, Scene Understanding | Unverified
ASI-Seg: Audio-Driven Surgical Instrument Segmentation with Surgeon Intention Understanding | Jul 28, 2024 | Contrastive Learning, Intention-oriented Segmentation | Code Available
GP-VLS: A general-purpose vision language model for surgery | Jul 27, 2024 | Language Modeling, Language Modelling | Unverified
Answerability Fields: Answerable Location Estimation via Diffusion Models | Jul 26, 2024 | Question Answering, Scene Understanding | Unverified
3D Question Answering for City Scene Understanding | Jul 24, 2024 | Autonomous Driving, Question Answering | Unverified
Augmented Efficiency: Reducing Memory Footprint and Accelerating Inference for 3D Semantic Segmentation through Hybrid Vision | Jul 23, 2024 | 2D Semantic Segmentation, 3D Semantic Segmentation | Unverified
InLUT3D: Challenging real indoor dataset for point cloud analysis | Jul 22, 2024 | Benchmarking, Scene Understanding | Unverified
VideoGameBunny: Towards vision assistants for video games | Jul 21, 2024 | Image Captioning, Scene Understanding | Unverified
A New Lightweight Hybrid Graph Convolutional Neural Network -- CNN Scheme for Scene Classification using Object Detection Inference | Jul 19, 2024 | Autonomous Vehicles, object-detection | Code Available
OpenSU3D: Open World 3D Scene Understanding using Foundation Models | Jul 19, 2024 | Scene Understanding, Spatial Reasoning | Unverified
MC-PanDA: Mask Confidence for Panoptic Domain Adaptation | Jul 19, 2024 | Domain Adaptation, Panoptic Segmentation | Code Available
GaussianBeV: 3D Gaussian Representation meets Perception Models for BeV Segmentation | Jul 19, 2024 | BEV Segmentation, Scene Understanding | Unverified
Open Vocabulary 3D Scene Understanding via Geometry Guided Self-Distillation | Jul 18, 2024 | Knowledge Distillation, Representation Learning | Unverified
Training-Free Model Merging for Multi-target Domain Adaptation | Jul 18, 2024 | Domain Adaptation, Multi-target Domain Adaptation | Unverified
InfoNorm: Mutual Information Shaping of Normals for Sparse-View Reconstruction | Jul 17, 2024 | Scene Understanding, Surface Reconstruction | Code Available
Benchmarking Vision Language Models for Cultural Understanding | Jul 15, 2024 | Benchmarking, Question Answering | Unverified
Shape2Scene: 3D Scene Representation Learning Through Pre-training on Shape Data | Jul 14, 2024 | 3D Object Detection, 3D Semantic Segmentation | Code Available
Dense Multimodal Alignment for Open-Vocabulary 3D Scene Understanding | Jul 13, 2024 | Scene Understanding, Zero-Shot Learning | Unverified
BLOS-BEV: Navigation Map Enhanced Lane Segmentation Network, Beyond Line of Sight | Jul 11, 2024 | Autonomous Driving, BEV Segmentation | Unverified
Pareto Low-Rank Adapters: Efficient Multi-Task Learning with Preferences | Jul 10, 2024 | Multi-Task Learning, Scene Understanding | Unverified
Swiss DINO: Efficient and Versatile Vision Framework for On-device Personal Object Search | Jul 10, 2024 | Few-Shot Learning, GPU | Code Available
LVLM-empowered Multi-modal Representation Learning for Visual Place Recognition | Jul 9, 2024 | Instruction Following, Representation Learning | Unverified
Joint prototype and coefficient prediction for 3D instance segmentation | Jul 9, 2024 | 3D Instance Segmentation, Instance Segmentation | Unverified
Self-supervised Learning via Cluster Distance Prediction for Operating Room Context Awareness | Jul 7, 2024 | Activity Recognition, Scene Understanding | Unverified
Hybrid Primal Sketch: Combining Analogy, Qualitative Representations, and Computer Vision for Scene Understanding | Jul 5, 2024 | Scene Understanding | Unverified
PanopticRecon: Leverage Open-vocabulary Instance Segmentation for Zero-shot Panoptic Reconstruction | Jul 1, 2024 | 3D Panoptic Segmentation, Instance Segmentation | Unverified
ESGNN: Towards Equivariant Scene Graph Neural Network for 3D Scene Understanding | Jun 30, 2024 | Graph Generation, Graph Neural Network | Unverified
EgoGaussian: Dynamic Scene Understanding from Egocentric Video with 3D Gaussian Splatting | Jun 28, 2024 | Human-Object Interaction Detection, Object | Unverified
PPTFormer: Pseudo Multi-Perspective Transformer for UAV Segmentation | Jun 28, 2024 | Decoder, Image Segmentation | Unverified
3D-MVP: 3D Multiview Pretraining for Robotic Manipulation | Jun 26, 2024 | Decoder, Robot Manipulation | Unverified
GPT-4V Explorations: Mining Autonomous Driving | Jun 24, 2024 | Autonomous Driving, Decision Making | Unverified
EvSegSNN: Neuromorphic Semantic Segmentation for Event Data | Jun 20, 2024 | Autonomous Vehicles, Decoder | Unverified
Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding | Jun 17, 2024 | 3D Object Detection, 3D Semantic Segmentation | Unverified
DistillNeRF: Perceiving 3D Scenes from Single-Glance Images by Distilling Neural Fields and Foundation Model Features | Jun 17, 2024 | 3D geometry, 3D Semantic Occupancy Prediction | Unverified
MapVision: CVPR 2024 Autonomous Grand Challenge Mapless Driving Tech Report | Jun 14, 2024 | Autonomous Driving, Scene Understanding | Unverified
Category-level Neural Field for Reconstruction of Partially Observed Objects in Indoor Environment | Jun 12, 2024 | 3D Reconstruction, Scene Understanding | Code Available