| METOR: A Unified Framework for Mutual Enhancement of Objects and Relationships in Open-vocabulary Video Visual Relationship Detection | May 10, 2025 | Object, object-detection | Code Available | 0 |
| Underwater object detection in sonar imagery with detection transformer and Zero-shot neural architecture search | May 10, 2025 | Neural Architecture Search, Object | Unverified | 0 |
| An Edge AI Solution for Space Object Detection | May 8, 2025 | Deep Learning, Object | Unverified | 0 |
| Enhancing Satellite Object Localization with Dilated Convolutions and Attention-aided Spatial Pooling | May 8, 2025 | feature selection, Object | Code Available | 0 |
| A Simple Detector with Frame Dynamics is a Strong Tracker | May 8, 2025 | Object, object-detection | Code Available | 1 |
| Visual Affordances: Enabling Robots to Understand Object Functionality | May 8, 2025 | Object, Prediction | Unverified | 0 |
| PaniCar: Securing the Perception of Advanced Driving Assistance Systems Against Emergency Vehicle Lighting | May 8, 2025 | Autonomous Vehicles, Flare Removal | Unverified | 0 |
| MDE-Edit: Masked Dual-Editing for Multi-Object Image Editing via Diffusion Models | May 8, 2025 | Attribute, Image Manipulation | Unverified | 0 |
| Web2Grasp: Learning Functional Grasps from Web Images of Hand-Object Interactions | May 7, 2025 | Object | Unverified | 0 |
| Low Resolution Next Best View for Robot Packing | May 7, 2025 | 3D Reconstruction, Object | Unverified | 0 |
| One2Any: One-Reference 6D Pose Estimation for Any Object | May 7, 2025 | 6D Pose Estimation, 6D Pose Estimation using RGB | Unverified | 0 |
| CountDiffusion: Text-to-Image Synthesis with Training-Free Counting-Guidance Diffusion | May 7, 2025 | Denoising, Image Generation | Unverified | 0 |
| AS3D: 2D-Assisted Cross-Modal Understanding with Semantic-Spatial Scene Graphs for 3D Visual Grounding | May 7, 2025 | 3D visual grounding, Graph Attention | Code Available | 0 |
| Corner Cases: How Size and Position of Objects Challenge ImageNet-Trained Models | May 6, 2025 | Object, Position | Unverified | 0 |
| EOPose : Exemplar-based object reposing using Generalized Pose Correspondences | May 6, 2025 | Object, SSIM | Unverified | 0 |
| Null Counterfactual Factor Interactions for Goal-Conditioned Reinforcement Learning | May 6, 2025 | counterfactual, Object | Unverified | 0 |
| Sim2Real Transfer for Vision-Based Grasp Verification | May 5, 2025 | Object, object-detection | Code Available | 0 |
| Hierarchical Compact Clustering Attention (COCA) for Unsupervised Object-Centric Learning | May 4, 2025 | Clustering, Decoder | Unverified | 0 |
| RESAnything: Attribute Prompting for Arbitrary Referring Segmentation | May 3, 2025 | Attribute, Image Segmentation | Unverified | 0 |
| Probabilistic Interactive 3D Segmentation with Hierarchical Neural Processes | May 3, 2025 | Object, Segmentation | Unverified | 0 |
| FreeInsert: Disentangled Text-Guided Object Insertion in 3D Gaussian Scene without Spatial Priors | May 2, 2025 | Object, Spatial Reasoning | Unverified | 0 |
| CDFormer: Cross-Domain Few-Shot Object Detection Transformer Against Feature Confusion | May 2, 2025 | Cross-Domain Few-Shot, Cross-Domain Few-Shot Object Detection | Code Available | 1 |
| Inconsistency-based Active Learning for LiDAR Object Detection | May 1, 2025 | Active Learning, Autonomous Driving | Unverified | 0 |
| HeAL3D: Heuristical-enhanced Active Learning for 3D Object Detection | May 1, 2025 | 3D Object Detection, Active Learning | Unverified | 0 |
| Enhancing Self-Supervised Fine-Grained Video Object Tracking with Dynamic Memory Prediction | Apr 30, 2025 | Decision Making, Object | Unverified | 0 |
| Stereo X-ray tomography on deformed object tracking | Apr 30, 2025 | Object, Object Tracking | Unverified | 0 |
| MoSAM: Motion-Guided Segment Anything Model with Spatial-Temporal Memory Selection | Apr 30, 2025 | Instance Segmentation, Interactive Segmentation | Unverified | 0 |
| Learning to Borrow Features for Improved Detection of Small Objects in Single-Shot Detectors | Apr 30, 2025 | Descriptive, Object | Unverified | 0 |
| DOPE: Dual Object Perception-Enhancement Network for Vision-and-Language Navigation | Apr 30, 2025 | Navigate, Object | Unverified | 0 |
| LLM-Empowered Embodied Agent for Memory-Augmented Task Planning in Household Robotics | Apr 30, 2025 | In-Context Learning, Object | Code Available | 1 |
| Black-Box Visual Prompt Engineering for Mitigating Object Hallucination in Large Vision Language Models | Apr 30, 2025 | Hallucination, Object | Unverified | 0 |
| The Mean of Multi-Object Trajectories | Apr 29, 2025 | Multi-Object Tracking, Object | Unverified | 0 |
| Hierarchical Context Learning of object components for unsupervised semantic segmentation | Apr 29, 2025 | Object, Self-Supervised Learning | Code Available | 0 |
| Category-Level and Open-Set Object Pose Estimation for Robotics | Apr 28, 2025 | 6D Pose Estimation, 6D Pose Estimation using RGB | Unverified | 0 |
| LM-MCVT: A Lightweight Multi-modal Multi-view Convolutional-Vision Transformer Approach for 3D Object Recognition | Apr 27, 2025 | 3D Object Recognition, Object | Unverified | 0 |
| Dexonomy: Synthesizing All Dexterous Grasp Types in a Grasp Taxonomy | Apr 26, 2025 | All, Object | Unverified | 0 |
| A Review of 3D Object Detection with Vision-Language Models | Apr 25, 2025 | 3D Object Detection, Object | Unverified | 0 |
| Multi-Sensor Fusion of Active and Passive Measurements for Extended Object Tracking | Apr 25, 2025 | Object, Object Tracking | Unverified | 0 |
| PCF-Grasp: Converting Point Completion to Geometry Feature to Enhance 6-DoF Grasp | Apr 22, 2025 | Object | Unverified | 0 |
| Object Learning and Robust 3D Reconstruction | Apr 22, 2025 | 3D Reconstruction, Object | Unverified | 0 |
| DeepPD: Joint Phase and Object Estimation from Phase Diversity with Neural Calibration of a Deformable Mirror | Apr 19, 2025 | Diversity, Object | Unverified | 0 |
| HMPE:HeatMap Embedding for Efficient Transformer-Based Small Object Detection | Apr 18, 2025 | Decoder, Feature Engineering | Unverified | 0 |
| Few-Shot Referring Video Single- and Multi-Object Segmentation via Cross-Modal Affinity with Instance Sequence Matching | Apr 18, 2025 | Object, Referring Video Object Segmentation | Code Available | 0 |
| Visual Intention Grounding for Egocentric Assistants | Apr 18, 2025 | Object, Visual Grounding | Unverified | 0 |
| SAR Object Detection with Self-Supervised Pretraining and Curriculum-Aware Sampling | Apr 17, 2025 | Disaster Response, Object | Unverified | 0 |
| VLLFL: A Vision-Language Model Based Lightweight Federated Learning Framework for Smart Agriculture | Apr 17, 2025 | Federated Learning, Language Modeling | Unverified | 0 |
| RF-DETR Object Detection vs YOLOv12 : A Study of Transformer-based and CNN-based Architectures for Single-Class and Multi-Class Greenfruit Detection in Complex Orchard Environments Under Label Ambiguity | Apr 17, 2025 | Computational Efficiency, Object | Unverified | 0 |
| Crossing the Human-Robot Embodiment Gap with Sim-to-Real RL using One Human Demonstration | Apr 17, 2025 | Data Augmentation, Human-Object Interaction Detection | Unverified | 0 |
| ViTa-Zero: Zero-shot Visuotactile Object 6D Pose Estimation | Apr 17, 2025 | 6D Pose Estimation, hand-object pose | Unverified | 0 |
| HiScene: Creating Hierarchical 3D Scenes with Isometric View Generation | Apr 17, 2025 | 3D Generation, Image Generation | Unverified | 0 |