| Bring Remote Sensing Object Detect Into Nature Language Model: Using SFT Method | Mar 11, 2025 | Language Modeling, Language Modelling | —Unverified | 0 |
| Hierarchical Contact-Rich Trajectory Optimization for Multi-Modal Manipulation using Tight Convex Relaxations | Mar 11, 2025 | Contact-rich Manipulation, Object | —Unverified | 0 |
| OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting | Mar 11, 2025 | Hallucination, Object | —Unverified | 0 |
| Embodied Crowd Counting | Mar 11, 2025 | Crowd Counting, Object | —Unverified | 0 |
| Large Language Model Guided Progressive Feature Alignment for Multimodal UAV Object Detection | Mar 10, 2025 | Language Modeling, Language Modelling | —Unverified | 0 |
| Recovering Partially Corrupted Major Objects through Tri-modality Based Image Completion | Mar 10, 2025 | Object, Specificity | —Unverified | 0 |
| Large model enhanced computational ghost imaging | Mar 10, 2025 | Image Reconstruction, model | Code Available | 0 |
| Erase Diffusion: Empowering Object Removal Through Calibrating Diffusion Pathways | Mar 10, 2025 | Image Inpainting, Object | —Unverified | 0 |
| Multi-Modal 3D Mesh Reconstruction from Images and Text | Mar 10, 2025 | 3D Object Reconstruction, 3D Reconstruction | —Unverified | 0 |
| Find your Needle: Small Object Image Retrieval via Multi-Object Attention Optimization | Mar 10, 2025 | Image Retrieval, Object | —Unverified | 0 |
| Hierarchical Cross-Modal Alignment for Open-Vocabulary 3D Object Detection | Mar 10, 2025 | 3D Object Detection, cross-modal alignment | —Unverified | 0 |
| EAZY: Eliminating Hallucinations in LVLMs by Zeroing out Hallucinatory Image Tokens | Mar 10, 2025 | Hallucination, Language Modeling | —Unverified | 0 |
| Aligning Instance-Semantic Sparse Representation towards Unsupervised Object Segmentation and Shape Abstraction with Repeatable Primitives | Mar 10, 2025 | Instance Segmentation, Object | —Unverified | 0 |
| A Light Perspective for 3D Object Detection | Mar 10, 2025 | 3D Object Detection, Object | —Unverified | 0 |
| OV-SCAN: Semantically Consistent Alignment for Novel Object Discovery in Open-Vocabulary 3D Object Detection | Mar 9, 2025 | 3D Object Detection, Autonomous Driving | —Unverified | 0 |
| AxisPose: Model-Free Matching-Free Single-Shot 6D Object Pose Estimation via Axis Generation | Mar 9, 2025 | 3D Feature Matching, 6D Pose Estimation | —Unverified | 0 |
| D3DR: Lighting-Aware Object Insertion in Gaussian Splatting | Mar 9, 2025 | 3DGS, Denoising | —Unverified | 0 |
| OpenRSD: Towards Open-prompts for Object Detection in Remote Sensing Images | Mar 8, 2025 | Object, object-detection | —Unverified | 0 |
| Object-Centric World Model for Language-Guided Manipulation | Mar 8, 2025 | Autonomous Driving, model | —Unverified | 0 |
| Accurate and Efficient Two-Stage Gun Detection in Video | Mar 8, 2025 | Anomaly Detection, Object | —Unverified | 0 |
| OSCAR: Object Status and Contextual Awareness for Recipes to Support Non-Visual Cooking | Mar 7, 2025 | Object | —Unverified | 0 |
| 2D Object Detection: A Survey | Mar 7, 2025 | 2D Object Detection, Object | —Unverified | 0 |
| DecoupledGaussian: Object-Scene Decoupling for Physics-Based Interaction | Mar 7, 2025 | Autonomous Driving, Object | —Unverified | 0 |
| Teach YOLO to Remember: A Self-Distillation Approach for Continual Object Detection | Mar 6, 2025 | class-incremental learning, Class Incremental Learning | —Unverified | 0 |
| Energy-Guided Optimization for Personalized Image Editing with Pretrained Text-to-Image Diffusion Models | Mar 6, 2025 | Image Generation, Object | —Unverified | 0 |
| High-Precision Transformer-Based Visual Servoing for Humanoid Robots in Aligning Tiny Objects | Mar 6, 2025 | Object | —Unverified | 0 |
| ReynoldsFlow: Exquisite Flow Estimation via Reynolds Transport Theorem | Mar 6, 2025 | Motion Estimation, Object | Code Available | 0 |
| Learning Object Placement Programs for Indoor Scene Synthesis with Iterative Self Training | Mar 6, 2025 | Indoor Scene Synthesis, Object | —Unverified | 0 |
| Shaken, Not Stirred: A Novel Dataset for Visual Understanding of Glasses in Human-Robot Bartending Tasks | Mar 6, 2025 | Object, object-detection | —Unverified | 0 |
| Fine-Tuning Florence2 for Enhanced Object Detection in Un-constructed Environments: Vision-Language Model Approach | Mar 6, 2025 | GPU, Language Modeling | —Unverified | 0 |
| Afford-X: Generalizable and Slim Affordance Reasoning for Task-oriented Manipulation | Mar 5, 2025 | Object, Object Recognition | —Unverified | 0 |
| L2RDaS: Synthesizing 4D Radar Tensors for Model Generalization via Dataset Expansion | Mar 5, 2025 | Autonomous Driving, Object | —Unverified | 0 |
| Simulation-Based Performance Evaluation of 3D Object Detection Methods with Deep Learning for a LiDAR Point Cloud Dataset in a SOTIF-related Use Case | Mar 5, 2025 | 3D Object Detection, Object | Code Available | 0 |
| Active 6D Pose Estimation for Textureless Objects using Multi-View RGB Frames | Mar 5, 2025 | 6D Pose Estimation, Object | —Unverified | 0 |
| Towards Visual Discrimination and Reasoning of Real-World Physical Dynamics: Physics-Grounded Anomaly Detection | Mar 5, 2025 | Anomaly Detection, Object | —Unverified | 0 |
| BEVMOSNet: Multimodal Fusion for BEV Moving Object Segmentation | Mar 5, 2025 | Autonomous Vehicles, Motion Segmentation | —Unverified | 0 |
| A dataset-free approach for self-supervised learning of 3D reflectional symmetries | Mar 4, 2025 | Object, Self-Supervised Learning | —Unverified | 0 |
| MonoLite3D: Lightweight 3D Object Properties Estimation | Mar 4, 2025 | Object | —Unverified | 0 |
| ClipGrader: Leveraging Vision-Language Models for Robust Label Quality Assessment in Object Detection | Mar 3, 2025 | Object, object-detection | —Unverified | 0 |
| Category-level Meta-learned NeRF Priors for Efficient Object Mapping | Mar 3, 2025 | GPU, Meta-Learning | —Unverified | 0 |
| VideoHandles: Editing 3D Object Compositions in Videos Using Video Generative Priors | Mar 3, 2025 | 3D Reconstruction, Object | —Unverified | 0 |
| Object-Aware Video Matting with Cross-Frame Guidance | Mar 3, 2025 | Image Matting, Object | —Unverified | 0 |
| Language-Guided Object Search in Agricultural Environments | Mar 3, 2025 | Large Language Model, Object | —Unverified | 0 |
| AirRoom: Objects Matter in Room Reidentification | Mar 3, 2025 | Object, Semantic Segmentation | —Unverified | 0 |
| AI-Driven Relocation Tracking in Dynamic Kitchen Environments | Mar 3, 2025 | 2D Object Detection, 3D Reconstruction | Code Available | 0 |
| EigenActor: Variant Body-Object Interaction Generation Evolved from Invariant Action Basis Reasoning | Mar 1, 2025 | Human-Object Interaction Detection, Object | —Unverified | 0 |
| Taming Large Multimodal Agents for Ultra-low Bitrate Semantically Disentangled Image Compression | Mar 1, 2025 | Decoder, Image Compression | Code Available | 0 |
| Enhancing deep neural networks through complex-valued representations and Kuramoto synchronization dynamics | Feb 28, 2025 | Object | —Unverified | 0 |
| Towards Semantic 3D Hand-Object Interaction Generation via Functional Text Guidance | Feb 28, 2025 | Object | —Unverified | 0 |
| QORT-Former: Query-optimized Real-time Transformer for Understanding Two Hands Manipulating Objects | Feb 27, 2025 | 3D Pose Estimation, Action Recognition | —Unverified | 0 |