ASI-Seg: Audio-Driven Surgical Instrument Segmentation with Surgeon Intention Understanding | Jul 28, 2024 | Contrastive Learning Intention-oriented Segmentation | Code Available
LoST? Appearance-Invariant Place Recognition for Opposite Viewpoints using Visual Semantics | Apr 16, 2018 | Navigate Scene Understanding | Code Available
Lightweight integration of 3D features to improve 2D image segmentation | Dec 16, 2022 | Image Segmentation Scene Understanding | Code Available
Placental Vessel Segmentation and Registration in Fetoscopy: Literature Review and MICCAI FetReg2021 Challenge Findings | Jun 24, 2022 | Scene Understanding Semantic Segmentation | Code Available
Leveraging Acoustic Images for Effective Self-Supervised Audio Representation Learning | Aug 1, 2020 | Cross-Modal Retrieval Representation Learning | Code Available
Leveraging Automatic CAD Annotations for Supervised Learning in 3D Scene Understanding | Apr 18, 2025 | Deep Learning Point Cloud Completion | Code Available
Fast Scene Understanding for Autonomous Driving | Aug 8, 2017 | Autonomous Driving Decoder | Code Available
Artificial Color Constancy via GoogLeNet with Angular Loss Function | Nov 20, 2018 | Color Constancy Object Recognition | Code Available
Learning Rigidity in Dynamic Scenes with a Moving Camera for 3D Motion Field Estimation | Apr 12, 2018 | Optical Flow Estimation Scene Flow Estimation | Code Available
Learning Panoptic Segmentation from Instance Contours | Oct 16, 2020 | Clustering Instance Segmentation | Code Available
CLAIR-A: Leveraging Large Language Models to Judge Audio Captions | Sep 19, 2024 | Audio captioning Language Modeling | Code Available
False Negative Reduction in Video Instance Segmentation using Uncertainty Estimates | Jun 28, 2021 | Depth Estimation Instance Segmentation | Code Available
Implicit Background Estimation for Semantic Segmentation | May 23, 2019 | Scene Understanding Segmentation | Code Available
Learning Regional Purity for Instance Segmentation on 3D Point Clouds | Nov 3, 2020 | 3D Instance Segmentation 3D Semantic Segmentation | Code Available
Learning Monocular Depth by Distilling Cross-domain Stereo Networks | Aug 20, 2018 | Autonomous Driving Depth Estimation | Code Available
Facing the Void: Overcoming Missing Data in Multi-View Imagery | May 21, 2022 | Classification image-classification | Code Available
Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors | May 30, 2025 | 3D geometry Large Language Model | Code Available
Extremely Fine-Grained Visual Classification over Resembling Glyphs in the Wild | Aug 25, 2024 | Contrastive Learning Fine-Grained Image Classification | Code Available
Adversarial Attacks on Monocular Pose Estimation | Jul 14, 2022 | Depth Estimation Monocular Depth Estimation | Code Available
Language-based Colorization of Scene Sketches | Nov 17, 2019 | Colorization Image Generation | Code Available
Exploring Scene Affinity for Semi-Supervised LiDAR Semantic Segmentation | Aug 21, 2024 | 3D Semantic Segmentation Data Augmentation | Code Available
Knowledge-Guided Object Discovery with Acquired Deep Impressions | Mar 19, 2021 | Object Object Discovery | Code Available
Label-Attention Transformer with Geometrically Coherent Objects for Image Captioning | Sep 16, 2021 | Decoder Image Captioning | Code Available
LoCATe-GAT: Modeling Multi-Scale Local Context and Action Relationships for Zero-Shot Action Recognition | Nov 27, 2024 | Action Recognition Graph Attention | Code Available
Monocular 3D Object Detection with Pseudo-LiDAR Point Cloud | Mar 23, 2019 | 3D Object Detection Depth Estimation | Code Available
Deep Reinforcement Learning on a Budget: 3D Control and Reasoning Without a Supercomputer | Apr 3, 2019 | Deep Reinforcement Learning Reinforcement Learning | Code Available
Benchmarking Feature Upsampling Methods for Vision Foundation Models using Interactive Segmentation | May 4, 2025 | Benchmarking Feature Upsampling | Code Available
Interpretable Visual Understanding with Cognitive Attention Network | Aug 6, 2021 | Scene Understanding Visual Commonsense Reasoning | Code Available
P2AT: Pyramid Pooling Axial Transformer for Real-time Semantic Segmentation | Oct 23, 2023 | Autonomous Driving Decoder | Code Available
Single Image 3D Object Estimation with Primitive Graph Networks | Sep 9, 2021 | Graph Neural Network Object | Code Available
Exploiting Temporal Coherence for Multi-modal Video Categorization | Feb 7, 2020 | object-detection Object Detection | Unverified
Exploiting High Level Scene Cues in Stereo Reconstruction | Dec 1, 2015 | 3D Reconstruction Scene Understanding | Unverified
Explicit3D: Graph Network with Spatial Inference for Single Image 3D Object Detection | Feb 13, 2023 | 3D Object Detection Graph Generation | Unverified
Challenges for Monocular 6D Object Pose Estimation in Robotics | Jul 22, 2023 | 6D Pose Estimation using RGB Object | Unverified
ArK: Augmented Reality with Knowledge Interactive Emergent Ability | May 1, 2023 | AI Agent Mixed Reality | Unverified
Explainable Scene Understanding with Qualitative Representations and Graph Neural Networks | Apr 17, 2025 | Autonomous Driving Scene Understanding | Unverified
Expanding Frozen Vision-Language Models without Retraining: Towards Improved Robot Perception | Aug 31, 2023 | Activity Recognition Human Activity Recognition | Unverified
Exosense: A Vision-Based Scene Understanding System For Exoskeletons | Mar 21, 2024 | Language Modelling Motion Planning | Unverified
Argus: Leveraging Multiview Images for Improved 3-D Scene Understanding With Large Language Models | Jul 17, 2025 | 3D Point Cloud Reconstruction Point cloud reconstruction | Unverified
Adversarial Attacks on Monocular Depth Estimation | Mar 23, 2020 | Autonomous Driving Depth Estimation | Unverified
ExCap3D: Expressive 3D Scene Understanding via Object Captioning with Varying Detail | Mar 21, 2025 | Object Scene Understanding | Unverified
EvSegSNN: Neuromorphic Semantic Segmentation for Event Data | Jun 20, 2024 | Autonomous Vehicles Decoder | Unverified
EvidMTL: Evidential Multi-Task Learning for Uncertainty-Aware Semantic Surface Mapping from Monocular RGB Images | Mar 6, 2025 | Depth Estimation Depth Prediction | Unverified
Every SAM Drop Counts: Embracing Semantic Priors for Multi-Modality Image Fusion and Beyond | Mar 3, 2025 | Infrared And Visible Image Fusion Scene Understanding | Unverified
Event fields: Capturing light fields at high speed, resolution, and dynamic range | Dec 9, 2024 | Depth Estimation Scene Understanding | Unverified
Category-Level and Open-Set Object Pose Estimation for Robotics | Apr 28, 2025 | 6D Pose Estimation 6D Pose Estimation using RGB | Unverified
Evaluation of Multimodal Semantic Segmentation using RGB-D Data | Mar 31, 2021 | Scene Understanding Semantic Segmentation | Unverified
Catch Me if You Can: A Novel Task for Detection of Covert Geo-Locations (CGL) | Feb 5, 2022 | object-detection Object Detection | Unverified
A Review on Visual-SLAM: Advancements from Geometric Modelling to Learning-based Semantic Scene Understanding | Sep 12, 2022 | Scene Understanding | Unverified
Advancing the Understanding of Fine-Grained 3D Forest Structures using Digital Cousins and Simulation-to-Reality: Methods and Datasets | Jan 7, 2025 | Data Augmentation parameter estimation | Unverified