| RHINO: Learning Real-Time Humanoid-Human-Object Interaction from Human Demonstrations | Feb 18, 2025 | Human-Object Interaction Detection, Object | Unverified | 0 |
| ROI-NeRFs: Hi-Fi Visualization of Objects of Interest within a Scene by NeRFs Composition | Feb 18, 2025 | 3D Reconstruction, NeRF | Unverified | 0 |
| Revealing Bias Formation in Deep Neural Networks Through the Geometric Mechanisms of Human Visual Decoupling | Feb 17, 2025 | Object, Object Recognition | Unverified | 0 |
| Object-Centric Image to Video Generation with Language Guidance | Feb 17, 2025 | Image to Video Generation, Object | Code Available | 1 |
| Enhancing Transparent Object Pose Estimation: A Fusion of GDR-Net and Edge Detection | Feb 17, 2025 | 6D Pose Estimation using RGB, Edge Detection | Unverified | 0 |
| A Monocular Event-Camera Motion Capture System | Feb 17, 2025 | Object | Unverified | 0 |
| DA-Mamba: Domain Adaptive Hybrid Mamba-Transformer Based One-Stage Object Detection | Feb 16, 2025 | Domain Adaptation, Knowledge Distillation | Code Available | 1 |
| Knowing Your Target: Target-Aware Transformer Makes Better Spatio-Temporal Video Grounding | Feb 16, 2025 | Attribute, Object | Code Available | 1 |
| FocalCount: Towards Class-Count Imbalance in Class-Agnostic Counting | Feb 15, 2025 | Object, Object Counting | Unverified | 0 |
| Evaluating the Meta- and Object-Level Reasoning of Large Language Models for Question Answering | Feb 14, 2025 | Mathematical Reasoning, Object | Unverified | 0 |
| HIPPo: Harnessing Image-to-3D Priors for Model-free Zero-shot 6D Pose Estimation | Feb 14, 2025 | 3D Reconstruction, 6D Pose Estimation | Unverified | 0 |
| Object Detection and Tracking | Feb 14, 2025 | Deep Learning, Object | Code Available | 0 |
| Object-Centric Latent Action Learning | Feb 13, 2025 | Imitation Learning, Object | Unverified | 0 |
| Safe Multi-agent Satellite Servicing with Control Barrier Functions | Feb 13, 2025 | Object, Position | Unverified | 0 |
| CineMaster: A 3D-Aware and Controllable Framework for Cinematic Text-to-Video Generation | Feb 12, 2025 | Object, Text-to-Video Generation | Unverified | 0 |
| Dense Object Detection Based on De-homogenized Queries | Feb 11, 2025 | Dense Object Detection, Object | Unverified | 0 |
| Articulate That Object Part (ATOP): 3D Part Articulation from Text and Motion Personalization | Feb 11, 2025 | Image Generation, Motion Generation | Unverified | 0 |
| VidCRAFT3: Camera, Object, and Lighting Control for Image-to-Video Generation | Feb 11, 2025 | Image to Video Generation, Object | Unverified | 0 |
| PlaySlot: Learning Inverse Latent Dynamics for Controllable Object-Centric Video Prediction and Planning | Feb 11, 2025 | Object, Video Prediction | Code Available | 1 |
| SAVE: Self-Attention on Visual Embedding for Zero-Shot Generic Object Counting | Feb 10, 2025 | Exemplar-Free Counting, Object | Code Available | 1 |
| Secure Visual Data Processing via Federated Learning | Feb 9, 2025 | Federated Learning, Management | Unverified | 0 |
| LP-DETR: Layer-wise Progressive Relations for Object Detection | Feb 7, 2025 | Decoder, Object | Unverified | 0 |
| Neural Clustering for Prefractured Mesh Generation in Real-time Object Destruction | Feb 7, 2025 | Clustering, Object | Unverified | 0 |
| HD-EPIC: A Highly-Detailed Egocentric Video Dataset | Feb 6, 2025 | Action Recognition, Nutrition | Unverified | 0 |
| Advanced Object Detection and Pose Estimation with Hybrid Task Cascade and High-Resolution Networks | Feb 6, 2025 | Autonomous Driving, Object | Unverified | 0 |
| PartEdit: Fine-Grained Image Editing using Pre-Trained Diffusion Models | Feb 6, 2025 | Object, Text-based Image Editing | Unverified | 0 |
| AnyPlace: Learning Generalized Object Placement for Robot Manipulation | Feb 6, 2025 | Object, Pose Prediction | Unverified | 0 |
| UAV Cognitive Semantic Communications Enabled by Knowledge Graph for Robust Object Detection | Feb 6, 2025 | Object, object-detection | Unverified | 0 |
| Enhancing people localisation in drone imagery for better crowd management by utilising every pixel in high-resolution images | Feb 6, 2025 | Crowd Counting, Management | Unverified | 0 |
| Probing a Vision-Language-Action Model for Symbolic States and Integration into a Cognitive Architecture | Feb 6, 2025 | Object, Vision-Language-Action | Unverified | 0 |
| Disentangling CLIP for Multi-Object Perception | Feb 5, 2025 | Disentanglement, Image Classification | Unverified | 0 |
| ZISVFM: Zero-Shot Object Instance Segmentation in Indoor Robotic Environments with Vision Foundation Models | Feb 5, 2025 | Instance Segmentation, Object | Code Available | 0 |
| Rethinking Vision Transformer for Object Centric Foundation Models | Feb 4, 2025 | Object, Object Tracking | Code Available | 0 |
| Uncertainty Quantification for Collaborative Object Detection Under Adversarial Attacks | Feb 4, 2025 | Adversarial Robustness, Autonomous Driving | Unverified | 0 |
| Mitigating Object Hallucinations in Large Vision-Language Models via Attention Calibration | Feb 4, 2025 | Attribute, Hallucination | Unverified | 0 |
| TUMTraffic-VideoQA: A Benchmark for Unified Spatio-Temporal Video Understanding in Traffic Scenes | Feb 4, 2025 | Autonomous Driving, Multiple-choice | Code Available | 1 |
| Can You Move These Over There? An LLM-based VR Mover for Supporting Object Manipulation | Feb 4, 2025 | Object | Unverified | 0 |
| Diff9D: Diffusion-Based Domain-Generalized Category-Level 9-DoF Object Pose Estimation | Feb 4, 2025 | Denoising, Domain Generalization | Code Available | 2 |
| Articulate AnyMesh: Open-Vocabulary 3D Articulated Objects Modeling | Feb 4, 2025 | Object, Visual Prompting | Unverified | 0 |
| Mitigating Hallucinations in Large Vision-Language Models with Internal Fact-based Contrastive Decoding | Feb 3, 2025 | Attribute, MME | Unverified | 0 |
| Dynamic object goal pushing with mobile manipulators through model-free constrained reinforcement learning | Feb 3, 2025 | Friction, Object | Unverified | 0 |
| Neural Cellular Automata for Decentralized Sensing using a Soft Inductive Sensor Array for Distributed Manipulator Systems | Feb 3, 2025 | Object | Unverified | 0 |
| RealRAG: Retrieval-augmented Realistic Image Generation via Self-reflective Contrastive Learning | Feb 2, 2025 | Contrastive Learning, Image Generation | Unverified | 0 |
| SpikingRTNH: Spiking Neural Network for 4D Radar Object Detection | Jan 31, 2025 | 3D Object Detection, Autonomous Driving | Unverified | 0 |
| Let Human Sketches Help: Empowering Challenging Image Segmentation Task with Freehand Sketches | Jan 31, 2025 | Image Segmentation, Interactive Segmentation | Unverified | 0 |
| RUN: Reversible Unfolding Network for Concealed Object Segmentation | Jan 30, 2025 | Object, Segmentation | Unverified | 0 |
| Adaptive Object Detection for Indoor Navigation Assistance: A Performance Evaluation of Real-Time Algorithms | Jan 30, 2025 | Object, object-detection | Unverified | 0 |
| Efficient Feature Fusion for UAV Object Detection | Jan 29, 2025 | Object, object-detection | Code Available | 0 |
| Efficient Interactive 3D Multi-Object Removal | Jan 29, 2025 | Object, Scene Understanding | Unverified | 0 |
| DINOSTAR: Deep Iterative Neural Object Detector Self-Supervised Training for Roadside LiDAR Applications | Jan 28, 2025 | Object, object-detection | Unverified | 0 |