| LoMOE: Localized Multi-Object Editing via Multi-Diffusion | Mar 1, 2024 | Object | Unverified | 0 |
| HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding | Mar 1, 2024 | Hallucination, Object | Code Available | 2 |
| Learning Causal Features for Incremental Object Detection | Mar 1, 2024 | Incremental Learning, Object | Unverified | 0 |
| FusionVision: A comprehensive approach of 3D object reconstruction and segmentation from RGB-D cameras using YOLO and fast segment anything | Feb 29, 2024 | 3D Object Reconstruction, Instance Segmentation | Code Available | 2 |
| Privacy-Preserving Autoencoder for Collaborative Object Detection | Feb 29, 2024 | License Plate Recognition, Object | Code Available | 0 |
| DOZE: A Dataset for Open-Vocabulary Zero-Shot Object Navigation in Dynamic Environments | Feb 29, 2024 | Attribute, Collision Avoidance | Unverified | 0 |
| A Simple yet Effective Network based on Vision Transformer for Camouflaged Object and Salient Object Detection | Feb 29, 2024 | Object, object-detection | Code Available | 1 |
| Debiased Novel Category Discovering and Localization | Feb 29, 2024 | Contrastive Learning, Novel Class Discovery | Unverified | 0 |
| Boosting Semi-Supervised Object Detection in Remote Sensing Images With Active Teaching | Feb 29, 2024 | Active Learning, Object | Unverified | 0 |
| ProtoP-OD: Explainable Object Detection with Prototypical Parts | Feb 29, 2024 | Object, object-detection | Unverified | 0 |
| SeMoLi: What Moves Together Belongs Together | Feb 29, 2024 | Clustering, Object | Unverified | 0 |
| Aligning Knowledge Graph with Visual Perception for Object-goal Navigation | Feb 29, 2024 | Object | Code Available | 1 |
| Spatial Coherence Loss: All Objects Matter in Salient and Camouflaged Object Detection | Feb 28, 2024 | All, Object | Unverified | 0 |
| EchoTrack: Auditory Referring Multi-Object Tracking for Autonomous Driving | Feb 28, 2024 | Autonomous Driving, Multi-Object Tracking | Code Available | 1 |
| Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization | Feb 28, 2024 | Domain Generalization, image-classification | Unverified | 0 |
| Zero-Shot Aerial Object Detection with Visual Description Regularization | Feb 28, 2024 | Object, object-detection | Code Available | 0 |
| A Multimodal Handover Failure Detection Dataset and Baselines | Feb 28, 2024 | Action Segmentation, Object | Code Available | 0 |
| Towards Unified 3D Object Detection via Algorithm and Data Unification | Feb 28, 2024 | 3D Object Detection, Monocular 3D Object Detection | Unverified | 0 |
| Detection of Micromobility Vehicles in Urban Traffic Videos | Feb 28, 2024 | Object, object-detection | Code Available | 0 |
| ACTrack: Adding Spatio-Temporal Condition for Visual Object Tracking | Feb 27, 2024 | Object, Object Tracking | Unverified | 0 |
| OSCaR: Object State Captioning and State Change Representation | Feb 27, 2024 | Change Detection, Object | Code Available | 1 |
| Deployment Prior Injection for Run-time Calibratable Object Detection | Feb 27, 2024 | Object, object-detection | Unverified | 0 |
| In Defense and Revival of Bayesian Filtering for Thermal Infrared Object Tracking | Feb 27, 2024 | Object, Object Tracking | Unverified | 0 |
| ShapeLLM: Universal 3D Object Understanding for Embodied Interaction | Feb 27, 2024 | 3D geometry, 3D Object Captioning | Code Available | 3 |
| ADL4D: Towards A Contextually Rich Dataset for 4D Activities of Daily Living | Feb 27, 2024 | Action Segmentation, Object | Unverified | 0 |
| Parallelized Spatiotemporal Binding | Feb 26, 2024 | Decoder, Object | Unverified | 0 |
| HOISDF: Constraining 3D Hand-Object Pose Estimation with Global Signed Distance Fields | Feb 26, 2024 | 3D Hand Pose Estimation, hand-object pose | Code Available | 2 |
| PhyGrasp: Generalizing Robotic Grasping with Physics-informed Large Multimodal Models | Feb 26, 2024 | Object, Physical Commonsense Reasoning | Unverified | 0 |
| SaRPFF: A Self-Attention with Register-based Pyramid Feature Fusion module for enhanced RLD detection | Feb 26, 2024 | Object, object-detection | Unverified | 0 |
| What Do Language Models Hear? Probing for Auditory Representations in Language Models | Feb 26, 2024 | Object | Unverified | 0 |
| Outline-Guided Object Inpainting with Diffusion Models | Feb 26, 2024 | Image Augmentation, Instance Segmentation | Unverified | 0 |
| Real-Time Vehicle Detection and Urban Traffic Behavior Analysis Based on UAV Traffic Videos on Mobile Devices | Feb 26, 2024 | Object, object-detection | Unverified | 0 |
| Semi-supervised Open-World Object Detection | Feb 25, 2024 | Incremental Learning, Object | Code Available | 1 |
| Multi-Object Tracking by Hierarchical Visual Representations | Feb 24, 2024 | Multi-Object Tracking, Object | Unverified | 0 |
| CLIPose: Category-Level Object Pose Estimation with Pre-trained Vision-Language Knowledge | Feb 24, 2024 | Contrastive Learning, Language Modelling | Unverified | 0 |
| Detection Is Tracking: Point Cloud Multi-Sweep Deep Learning Models Revisited | Feb 24, 2024 | Autonomous Driving, Object | Unverified | 0 |
| Exploring Failure Cases in Multimodal Reasoning About Physical Dynamics | Feb 24, 2024 | Language Modeling, Language Modelling | Unverified | 0 |
| Seeing is Believing: Mitigating Hallucination in Large Vision-Language Models via CLIP-Guided Decoding | Feb 23, 2024 | Hallucination, Object | Code Available | 1 |
| Grasp, See, and Place: Efficient Unknown Object Rearrangement with Policy Structure Prior | Feb 23, 2024 | Object, Object Rearrangement | Code Available | 2 |
| Background Denoising for Ptychography via Wigner Distribution Deconvolution | Feb 23, 2024 | Denoising, Object | Unverified | 0 |
| Object permanence in newborn chicks is robust against opposing evidence | Feb 22, 2024 | Object | Unverified | 0 |
| Path Planning based on 2D Object Bounding-box | Feb 22, 2024 | Autonomous Driving, Graph Neural Network | Unverified | 0 |
| Place Anything into Any Video | Feb 22, 2024 | 3D Generation, Object | Unverified | 0 |
| YOLO-TLA: An Efficient and Lightweight Small Object Detection Model based on YOLOv5 | Feb 22, 2024 | Object, object-detection | Unverified | 0 |
| Learning Dual-arm Object Rearrangement for Cartesian Robots | Feb 21, 2024 | Computational Efficiency, Object | Unverified | 0 |
| Unsupervised learning based object detection using Contrastive Learning | Feb 21, 2024 | Contrastive Learning, Object | Unverified | 0 |
| TransGOP: Transformer-Based Gaze Object Prediction | Feb 21, 2024 | Gaze Estimation, Object | Code Available | 1 |
| Weakly supervised localisation of prostate cancer using reinforcement learning for bi-parametric MR images | Feb 21, 2024 | Multiple Instance Learning, Object | Unverified | 0 |
| VOOM: Robust Visual Object Odometry and Mapping using Hierarchical Landmarks | Feb 21, 2024 | Computational Efficiency, Object | Code Available | 2 |
| DINOBot: Robot Manipulation via Retrieval and Alignment with Vision Foundation Models | Feb 20, 2024 | Imitation Learning, Object | Unverified | 0 |