| FreeInsert: Disentangled Text-Guided Object Insertion in 3D Gaussian Scene without Spatial Priors | May 2, 2025 | Object, Spatial Reasoning | Unverified | 0 |
| Inconsistency-based Active Learning for LiDAR Object Detection | May 1, 2025 | Active Learning, Autonomous Driving | Unverified | 0 |
| HeAL3D: Heuristical-enhanced Active Learning for 3D Object Detection | May 1, 2025 | 3D Object Detection, Active Learning | Unverified | 0 |
| Learning to Borrow Features for Improved Detection of Small Objects in Single-Shot Detectors | Apr 30, 2025 | Descriptive, Object | Unverified | 0 |
| Enhancing Self-Supervised Fine-Grained Video Object Tracking with Dynamic Memory Prediction | Apr 30, 2025 | Decision Making, Object | Unverified | 0 |
| Stereo X-ray tomography on deformed object tracking | Apr 30, 2025 | Object, Object Tracking | Unverified | 0 |
| DOPE: Dual Object Perception-Enhancement Network for Vision-and-Language Navigation | Apr 30, 2025 | Navigate, Object | Unverified | 0 |
| Black-Box Visual Prompt Engineering for Mitigating Object Hallucination in Large Vision Language Models | Apr 30, 2025 | Hallucination, Object | Unverified | 0 |
| MoSAM: Motion-Guided Segment Anything Model with Spatial-Temporal Memory Selection | Apr 30, 2025 | Instance Segmentation, Interactive Segmentation | Unverified | 0 |
| Hierarchical Context Learning of object components for unsupervised semantic segmentation | Apr 29, 2025 | Object, Self-Supervised Learning | Code Available | 0 |
| The Mean of Multi-Object Trajectories | Apr 29, 2025 | Multi-Object Tracking, Object | Unverified | 0 |
| Category-Level and Open-Set Object Pose Estimation for Robotics | Apr 28, 2025 | 6D Pose Estimation, 6D Pose Estimation using RGB | Unverified | 0 |
| LM-MCVT: A Lightweight Multi-modal Multi-view Convolutional-Vision Transformer Approach for 3D Object Recognition | Apr 27, 2025 | 3D Object Recognition, Object | Unverified | 0 |
| Dexonomy: Synthesizing All Dexterous Grasp Types in a Grasp Taxonomy | Apr 26, 2025 | All, Object | Unverified | 0 |
| A Review of 3D Object Detection with Vision-Language Models | Apr 25, 2025 | 3D Object Detection, Object | Unverified | 0 |
| Multi-Sensor Fusion of Active and Passive Measurements for Extended Object Tracking | Apr 25, 2025 | Object, Object Tracking | Unverified | 0 |
| Object Learning and Robust 3D Reconstruction | Apr 22, 2025 | 3D Reconstruction, Object | Unverified | 0 |
| PCF-Grasp: Converting Point Completion to Geometry Feature to Enhance 6-DoF Grasp | Apr 22, 2025 | Object | Unverified | 0 |
| DeepPD: Joint Phase and Object Estimation from Phase Diversity with Neural Calibration of a Deformable Mirror | Apr 19, 2025 | Diversity, Object | Unverified | 0 |
| Visual Intention Grounding for Egocentric Assistants | Apr 18, 2025 | Object, Visual Grounding | Unverified | 0 |
| HMPE:HeatMap Embedding for Efficient Transformer-Based Small Object Detection | Apr 18, 2025 | Decoder, Feature Engineering | Unverified | 0 |
| Few-Shot Referring Video Single- and Multi-Object Segmentation via Cross-Modal Affinity with Instance Sequence Matching | Apr 18, 2025 | Object, Referring Video Object Segmentation | Code Available | 0 |
| Crossing the Human-Robot Embodiment Gap with Sim-to-Real RL using One Human Demonstration | Apr 17, 2025 | Data Augmentation, Human-Object Interaction Detection | Unverified | 0 |
| SAR Object Detection with Self-Supervised Pretraining and Curriculum-Aware Sampling | Apr 17, 2025 | Disaster Response, Object | Unverified | 0 |
| ViTa-Zero: Zero-shot Visuotactile Object 6D Pose Estimation | Apr 17, 2025 | 6D Pose Estimation, hand-object pose | Unverified | 0 |
| RF-DETR Object Detection vs YOLOv12 : A Study of Transformer-based and CNN-based Architectures for Single-Class and Multi-Class Greenfruit Detection in Complex Orchard Environments Under Label Ambiguity | Apr 17, 2025 | Computational Efficiency, Object | Unverified | 0 |
| HiScene: Creating Hierarchical 3D Scenes with Isometric View Generation | Apr 17, 2025 | 3D Generation, Image Generation | Unverified | 0 |
| VLLFL: A Vision-Language Model Based Lightweight Federated Learning Framework for Smart Agriculture | Apr 17, 2025 | Federated Learning, Language Modeling | Unverified | 0 |
| Generalized Visual Relation Detection with Diffusion Models | Apr 16, 2025 | Graph Generation, Human-Object Interaction Detection | Unverified | 0 |
| A Review of YOLOv12: Attention-Based Enhancements vs. Previous Versions | Apr 16, 2025 | Computational Efficiency, Object | Unverified | 0 |
| DM-OSVP++: One-Shot View Planning Using 3D Diffusion Models for Active RGB-Based Object Reconstruction | Apr 16, 2025 | Object, Object Reconstruction | Unverified | 0 |
| RADLER: Radar Object Detection Leveraging Semantic 3D City Models and Self-Supervised Radar-Image Learning | Apr 16, 2025 | Object, object-detection | Unverified | 0 |
| Recent Advance in 3D Object and Scene Generation: A Survey | Apr 16, 2025 | 3D Generation, Object | Unverified | 0 |
| Object Placement for Anything | Apr 16, 2025 | Object | Unverified | 0 |
| Safe-Construct: Redefining Construction Safety Violation Recognition as 3D Multi-View Engagement Task | Apr 15, 2025 | 2D Object Detection, Object | Unverified | 0 |
| Weather-Aware Object Detection Transformer for Domain Adaptation | Apr 15, 2025 | Domain Adaptation, Object | Unverified | 0 |
| 3D Object Reconstruction with mmWave Radars | Apr 15, 2025 | 3D Object Reconstruction, 3D Shape Reconstruction | Unverified | 0 |
| COUNTS: Benchmarking Object Detectors and Multimodal Large Language Models under Distribution Shifts | Apr 14, 2025 | Benchmarking, Object | Unverified | 0 |
| HUMOTO: A 4D Dataset of Mocap Human Object Interactions | Apr 14, 2025 | Human-Object Interaction Detection, Motion Generation | Unverified | 0 |
| Multi-Object Grounding via Hierarchical Contrastive Siamese Transformers | Apr 14, 2025 | Object, Object Localization | Unverified | 0 |
| DiffMOD: Progressive Diffusion Point Denoising for Moving Object Detection in Remote Sensing | Apr 14, 2025 | Denoising, Density Estimation | Unverified | 0 |
| MASSeg : 2nd Technical Report for 4th PVUW MOSE Track | Apr 14, 2025 | Data Augmentation, Object | Code Available | 0 |
| RICCARDO: Radar Hit Prediction and Convolution for Camera-Radar 3D Object Detection | Apr 12, 2025 | 3D Object Detection, Object | Code Available | 0 |
| Cut-and-Splat: Leveraging Gaussian Splatting for Synthetic Data Generation | Apr 11, 2025 | Depth Estimation, Instance Segmentation | Code Available | 0 |
| Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization | Apr 11, 2025 | Denoising, Object | Unverified | 0 |
| Digital Twin Catalog: A Large-Scale Photorealistic 3D Object Digital Twin Dataset | Apr 11, 2025 | 3D Object Reconstruction, 3D Reconstruction | Unverified | 0 |
| WS-DETR: Robust Water Surface Object Detection through Vision-Radar Fusion with Detection Transformer | Apr 10, 2025 | Object, object-detection | Unverified | 0 |
| Learning Object Focused Attention | Apr 10, 2025 | Inductive Bias, Object | Unverified | 0 |
| POEM: Precise Object-level Editing via MLLM control | Apr 10, 2025 | Image Generation, Object | Unverified | 0 |
| SAMJAM: Zero-Shot Video Scene Graph Generation for Egocentric Kitchen Videos | Apr 10, 2025 | Graph Generation, Object | Unverified | 0 |