| 4D-Bench: Benchmarking Multi-modal Large Language Models for 4D Object Understanding | Mar 22, 2025 | Benchmarking, Object | Code Available | 0 |
| GOAL: Global-local Object Alignment Learning | Mar 22, 2025 | Descriptive, Object | Code Available | 1 |
| MAMAT: 3D Mamba-Based Atmospheric Turbulence Removal and its Object Detection Capability | Mar 22, 2025 | Mamba, Object | Unverified | 0 |
| RefCut: Interactive Segmentation with Reference Guidance | Mar 22, 2025 | Interactive Segmentation, Object | Unverified | 0 |
| Co-op: Correspondence-based Novel Object Pose Estimation | Mar 22, 2025 | Object, Pose Estimation | Unverified | 0 |
| Re-HOLD: Video Hand Object Interaction Reenactment via adaptive Layout-instructed Diffusion Model | Mar 21, 2025 | Disentanglement, Human-Object Interaction Detection | Unverified | 0 |
| ExCap3D: Expressive 3D Scene Understanding via Object Captioning with Varying Detail | Mar 21, 2025 | Object, Scene Understanding | Unverified | 0 |
| Which2comm: An Efficient Collaborative Perception Framework for 3D Object Detection | Mar 21, 2025 | 3D Object Detection, Object | Unverified | 0 |
| GraPLUS: Graph-based Placement Using Semantics for Image Composition | Mar 20, 2025 | Object | Unverified | 0 |
| MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance | Mar 20, 2025 | Image to Video Generation, Object | Unverified | 0 |
| Fine-Grained Open-Vocabulary Object Detection with Fined-Grained Prompts: Task, Dataset and Benchmark | Mar 19, 2025 | Object, object-detection | Unverified | 0 |
| xMOD: Cross-Modal Distillation for 2D/3D Multi-Object Discovery from 2D motion | Mar 19, 2025 | Multi-object discovery, Object | Code Available | 0 |
| Variational Message Passing-based Multiobject Tracking for MIMO-Radars using Raw Sensor Signals | Mar 19, 2025 | Object, Super-Resolution | Unverified | 0 |
| UltraFlwr -- An Efficient Federated Medical and Surgical Object Detection Framework | Mar 19, 2025 | Federated Learning, Object | Code Available | 1 |
| Test-Time Backdoor Detection for Object Detection Models | Mar 19, 2025 | image-classification, Image Classification | Unverified | 0 |
| GIVEPose: Gradual Intra-class Variation Elimination for RGB-based Category-Level Object Pose Estimation | Mar 19, 2025 | Object, Pose Estimation | Code Available | 1 |
| Volumetric Reconstruction From Partial Views for Task-Oriented Grasping | Mar 19, 2025 | Generative Adversarial Network, Object | Unverified | 0 |
| Intelligent Spatial Perception by Building Hierarchical 3D Scene Graphs for Indoor Scenarios with the Help of LLMs | Mar 19, 2025 | Object, Robot Navigation | Unverified | 0 |
| Robust Object Detection of Underwater Robot based on Domain Generalization | Mar 18, 2025 | Domain Generalization, Object | Code Available | 1 |
| HSOD-BIT-V2: A New Challenging Benchmark for Hyperspectral Salient Object Detection | Mar 18, 2025 | Object, object-detection | Code Available | 0 |
| FrustumFusionNets: A Three-Dimensional Object Detection Network Based on Tractor Road Scene | Mar 18, 2025 | Object, object-detection | Unverified | 0 |
| Learning Shape-Independent Transformation via Spherical Representations for Category-Level Object Pose Estimation | Mar 18, 2025 | Object, Pose Estimation | Unverified | 0 |
| PSA-SSL: Pose and Size-aware Self-Supervised Learning on LiDAR Point Clouds | Mar 18, 2025 | 3D Object Detection, 3D Semantic Segmentation | Code Available | 0 |
| MMR: A Large-scale Benchmark Dataset for Multi-target and Multi-granularity Reasoning Segmentation | Mar 18, 2025 | Object, Reasoning Segmentation | Code Available | 1 |
| Rethinking End-to-End 2D to 3D Scene Segmentation in Gaussian Splatting | Mar 18, 2025 | Instance Segmentation, Object | Code Available | 2 |
| LED: LLM Enhanced Open-Vocabulary Object Detection without Human Curated Data Generation | Mar 18, 2025 | Decoder, Object | Code Available | 0 |
| FLEX: A Framework for Learning Robot-Agnostic Force-based Skills Involving Sustained Contact Object Manipulation | Mar 17, 2025 | Imitation Learning, Object | Unverified | 0 |
| History-Aware Transformation of ReID Features for Multiple Object Tracking | Mar 16, 2025 | Multi-Object Tracking, Multiple Object Tracking | Code Available | 1 |
| Leveraging Motion Information for Better Self-Supervised Video Correspondence Learning | Mar 15, 2025 | Object, Semantic Segmentation | Unverified | 0 |
| Cognitive Disentanglement for Referring Multi-Object Tracking | Mar 14, 2025 | Disentanglement, Multi-Object Tracking | Unverified | 0 |
| MTV-Inpaint: Multi-Task Long Video Inpainting | Mar 14, 2025 | Image Inpainting, Object | Unverified | 0 |
| TASTE-Rob: Advancing Video Generation of Task-Oriented Hand-Object Interaction for Generalizable Robotic Manipulation | Mar 14, 2025 | Imitation Learning, Object | Unverified | 0 |
| Disentangled Object-Centric Image Representation for Robotic Manipulation | Mar 14, 2025 | Object | Unverified | 0 |
| MoEdit: On Learning Quantity Perception for Multi-object Image Editing | Mar 13, 2025 | Attribute, Image Generation | Code Available | 0 |
| 3D Extended Object Tracking based on Extruded B-Spline Side View Profiles | Mar 13, 2025 | Object, Object Tracking | Unverified | 0 |
| OmniSTVG: Toward Spatio-Temporal Omni-Object Video Grounding | Mar 13, 2025 | Object, Video Grounding | Code Available | 1 |
| 4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models | Mar 13, 2025 | Large Language Model, Object | Code Available | 2 |
| Semantic-Supervised Spatial-Temporal Fusion for LiDAR-based 3D Object Detection | Mar 13, 2025 | 3D Object Detection, Object | Unverified | 0 |
| OCPM^2: Extending the Process Mining Methodology for Object-Centric Event Data Extraction | Mar 13, 2025 | Management, Object | Unverified | 0 |
| Auto-Associative Memories for Direct Signalling of Visual Angle During Object Approaches | Mar 13, 2025 | Object | Unverified | 0 |
| DreamInsert: Zero-Shot Image-to-Video Object Insertion from A Single Image | Mar 13, 2025 | Object | Unverified | 0 |
| ROODI: Reconstructing Occluded Objects with Denoising Inpainters | Mar 13, 2025 | Denoising, Object | Unverified | 0 |
| 6D Object Pose Tracking in Internet Videos for Robotic Manipulation | Mar 13, 2025 | 6D Pose Estimation, Object | Unverified | 0 |
| KUDA: Keypoints to Unify Dynamics Learning and Visual Prompting for Open-Vocabulary Robotic Manipulation | Mar 13, 2025 | Object, Visual Prompting | Unverified | 0 |
| Object-Aware DINO (Oh-A-Dino): Enhancing Self-Supervised Representations for Multi-Object Instance Retrieval | Mar 12, 2025 | Object, Retrieval | Unverified | 0 |
| GASPACHO: Gaussian Splatting for Controllable Humans and Objects | Mar 12, 2025 | Human-Object Interaction Detection, Object | Unverified | 0 |
| InteractEdit: Zero-Shot Editing of Human-Object Interactions in Images | Mar 12, 2025 | Attribute, Human-Object Interaction Detection | Unverified | 0 |
| 2HandedAfforder: Learning Precise Actionable Bimanual Affordances from Human Videos | Mar 12, 2025 | Object | Unverified | 0 |
| TetraGrip: Sensor-Driven Multi-Suction Reactive Object Manipulation in Cluttered Scenes | Mar 12, 2025 | Object | Unverified | 0 |
| Bring Remote Sensing Object Detect Into Nature Language Model: Using SFT Method | Mar 11, 2025 | Language Modeling, Language Modelling | Unverified | 0 |