General Geometry-aware Weakly Supervised 3D Object Detection | Jul 18, 2024 | 3D Object Detection Object | Code Available | 1
Dual-Hybrid Attention Network for Specular Highlight Removal | Jul 17, 2024 | highlight removal Object Recognition | Code Available | 1
InfoNorm: Mutual Information Shaping of Normals for Sparse-View Reconstruction | Jul 17, 2024 | Scene Understanding Surface Reconstruction | Code Available | 0
Benchmarking Vision Language Models for Cultural Understanding | Jul 15, 2024 | Benchmarking Question Answering | Unverified | 0
No Train, all Gain: Self-Supervised Gradients Improve Deep Frozen Representations | Jul 15, 2024 | All Image Retrieval | Code Available | 1
Shape2Scene: 3D Scene Representation Learning Through Pre-training on Shape Data | Jul 14, 2024 | 3D Object Detection 3D Semantic Segmentation | Code Available | 0
Dense Multimodal Alignment for Open-Vocabulary 3D Scene Understanding | Jul 13, 2024 | Scene Understanding Zero-Shot Learning | Unverified | 0
BLOS-BEV: Navigation Map Enhanced Lane Segmentation Network, Beyond Line of Sight | Jul 11, 2024 | Autonomous Driving BEV Segmentation | Unverified | 0
Pareto Low-Rank Adapters: Efficient Multi-Task Learning with Preferences | Jul 10, 2024 | Multi-Task Learning Scene Understanding | Unverified | 0
Swiss DINO: Efficient and Versatile Vision Framework for On-device Personal Object Search | Jul 10, 2024 | Few-Shot Learning GPU | Code Available | 0
LVLM-empowered Multi-modal Representation Learning for Visual Place Recognition | Jul 9, 2024 | Instruction Following Representation Learning | Unverified | 0
Joint prototype and coefficient prediction for 3D instance segmentation | Jul 9, 2024 | 3D Instance Segmentation Instance Segmentation | Unverified | 0
Self-supervised Learning via Cluster Distance Prediction for Operating Room Context Awareness | Jul 7, 2024 | Activity Recognition Scene Understanding | Unverified | 0
Hybrid Primal Sketch: Combining Analogy, Qualitative Representations, and Computer Vision for Scene Understanding | Jul 5, 2024 | Scene Understanding | Unverified | 0
A Unified Framework for 3D Scene Understanding | Jul 3, 2024 | Contrastive Learning Knowledge Distillation | Code Available | 2
MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders | Jul 2, 2024 | Boundary Detection Human Parsing | Code Available | 1
Uni-DVPS: Unified Model for Depth-Aware Video Panoptic Segmentation | Jul 1, 2024 | Autonomous Driving Decoder | Code Available | 1
PanopticRecon: Leverage Open-vocabulary Instance Segmentation for Zero-shot Panoptic Reconstruction | Jul 1, 2024 | 3D Panoptic Segmentation Instance Segmentation | Unverified | 0
CSFNet: A Cosine Similarity Fusion Network for Real-Time RGB-X Semantic Segmentation of Driving Scenes | Jul 1, 2024 | Autonomous Vehicles Image Segmentation | Code Available | 1
ESGNN: Towards Equivariant Scene Graph Neural Network for 3D Scene Understanding | Jun 30, 2024 | Graph Generation Graph Neural Network | Unverified | 0
EgoGaussian: Dynamic Scene Understanding from Egocentric Video with 3D Gaussian Splatting | Jun 28, 2024 | Human-Object Interaction Detection Object | Unverified | 0
PPTFormer: Pseudo Multi-Perspective Transformer for UAV Segmentation | Jun 28, 2024 | Decoder Image Segmentation | Unverified | 0
3D-MVP: 3D Multiview Pretraining for Robotic Manipulation | Jun 26, 2024 | Decoder Robot Manipulation | Unverified | 0
GPT-4V Explorations: Mining Autonomous Driving | Jun 24, 2024 | Autonomous Driving Decision Making | Unverified | 0
AudioBench: A Universal Benchmark for Audio Large Language Models | Jun 23, 2024 | Audio Scene Understanding Instruction Following | Code Available | 3
EvSegSNN: Neuromorphic Semantic Segmentation for Event Data | Jun 20, 2024 | Autonomous Vehicles Decoder | Unverified | 0
StableSemantics: A Synthetic Language-Vision Dataset of Semantic Representations in Naturalistic Images | Jun 19, 2024 | Object Recognition Scene Understanding | Code Available | 2
DistillNeRF: Perceiving 3D Scenes from Single-Glance Images by Distilling Neural Fields and Foundation Model Features | Jun 17, 2024 | 3D geometry 3D Semantic Occupancy Prediction | Unverified | 0
Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding | Jun 17, 2024 | 3D Object Detection 3D Semantic Segmentation | Unverified | 0
MapVision: CVPR 2024 Autonomous Grand Challenge Mapless Driving Tech Report | Jun 14, 2024 | Autonomous Driving Scene Understanding | Unverified | 0
A Two-Stage Masked Autoencoder Based Network for Indoor Depth Completion | Jun 14, 2024 | 3D Reconstruction Autonomous Driving | Code Available | 1
MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding | Jun 13, 2024 | Multiple-choice Scene Understanding | Code Available | 1
Category-level Neural Field for Reconstruction of Partially Observed Objects in Indoor Environment | Jun 12, 2024 | 3D Reconstruction Scene Understanding | Code Available | 0
RS-Agent: Automating Remote Sensing Tasks through Intelligent Agent | Jun 11, 2024 | AI Agent Descriptive | Code Available | 2
FastLGS: Speeding up Language Embedded Gaussians with Feature Grid Mapping | Jun 4, 2024 | 3DGS Scene Understanding | Unverified | 0
EAGLE: Efficient Adaptive Geometry-based Learning in Cross-view Understanding | Jun 3, 2024 | Domain Adaptation Open Vocabulary Semantic Segmentation | Unverified | 0
Object Aware Egocentric Online Action Detection | Jun 3, 2024 | Action Detection Object | Unverified | 0
CYCLO: Cyclic Graph Transformer Approach to Multi-Object Relationship Modeling in Aerial Videos | Jun 3, 2024 | Graph Generation Scene Graph Generation | Unverified | 0
Semi-supervised Video Semantic Segmentation Using Unreliable Pseudo Labels for PVUW2024 | Jun 2, 2024 | Scene Parsing Scene Understanding | Unverified | 0
SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation | May 30, 2024 | Instruction Following parameter-efficient fine-tuning | Unverified | 0
Learning 3D Robotics Perception using Inductive Priors | May 30, 2024 | 3D Reconstruction Image Generation | Unverified | 0
Kestrel: Point Grounding Multimodal LLM for Part-Aware 3D Vision-Language Understanding | May 29, 2024 | Scene Understanding Segmentation | Unverified | 0
GOI: Find 3D Gaussians of Interest with an Optimizable Open-vocabulary Semantic-space Hyperplane | May 27, 2024 | 3DGS feature selection | Unverified | 0
Open-Vocabulary SAM3D: Towards Training-free Open-Vocabulary 3D Scene Understanding | May 24, 2024 | Scene Understanding Zero Shot Segmentation | Unverified | 0
Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis | May 23, 2024 | Novel View Synthesis Scene Understanding | Unverified | 0
Transformers for Image-Goal Navigation | May 23, 2024 | Navigate Scene Understanding | Unverified | 0
CoPeD-Advancing Multi-Robot Collaborative Perception: A Comprehensive Dataset in Real-World Environments | May 23, 2024 | Pose Estimation Scene Understanding | Code Available | 1
TS40K: a 3D Point Cloud Dataset of Rural Terrain and Electrical Transmission System | May 22, 2024 | 3D Object Detection 3D Semantic Segmentation | Unverified | 0
GameVLM: A Decision-making Framework for Robotic Task Planning Based on Visual Language Models and Zero-sum Games | May 22, 2024 | Code Generation Decision Making | Unverified | 0
Anticipating Object State Changes in Long Procedural Videos | May 21, 2024 | Object Object State Change Classification | Unverified | 0