| Articulated Object Interaction in Unknown Scenes with Whole-Body Mobile Manipulation | Mar 18, 2021 | Object | Code Available | 2 |
| ARCTIC: A Dataset for Dexterous Bimanual Hand-Object Manipulation | Apr 28, 2022 | 3D Reconstruction, Object | Code Available | 2 |
| DetGPT: Detect What You Need via Reasoning | May 23, 2023 | Autonomous Driving, Object | Code Available | 2 |
| FocalFormer3D : Focusing on Hard Instance for 3D Object Detection | Aug 8, 2023 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| Focal Loss for Dense Object Detection | Aug 7, 2017 | 2D Object Detection, Dense Object Detection | Code Available | 2 |
| Focal Sparse Convolutional Networks for 3D Object Detection | Apr 26, 2022 | 3D Object Detection, Object | Code Available | 2 |
| Fusing Visual Appearance and Geometry for Multi-modality 6DoF Object Tracking | Feb 22, 2023 | 3D Object Tracking, 6D Pose Estimation | Code Available | 2 |
| FusionVision: A comprehensive approach of 3D object reconstruction and segmentation from RGB-D cameras using YOLO and fast segment anything | Feb 29, 2024 | 3D Object Reconstruction, Instance Segmentation | Code Available | 2 |
| Aligning and Prompting Everything All at Once for Universal Visual Perception | Dec 4, 2023 | All, Object | Code Available | 2 |
| Detect Everything with Few Examples | Sep 22, 2023 | Binary Classification, Cross-Domain Few-Shot Object Detection | Code Available | 2 |
| DiffusionTrack: Diffusion Model For Multi-Object Tracking | Aug 19, 2023 | Denoising, model | Code Available | 2 |
| Generative Region-Language Pretraining for Open-Ended Object Detection | Mar 15, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| EAGLE: Eigen Aggregation Learning for Object-Centric Unsupervised Semantic Segmentation | Mar 3, 2024 | Object, Representation Learning | Code Available | 2 |
| Deep Snake for Real-Time Instance Segmentation | Jan 6, 2020 | GPU, Instance Segmentation | Code Available | 2 |
| AGLA: Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention | Jun 18, 2024 | Object, Response Generation | Code Available | 2 |
| Going Denser with Open-Vocabulary Part Segmentation | May 18, 2023 | Object, object-detection | Code Available | 2 |
| Grasp, See, and Place: Efficient Unknown Object Rearrangement with Policy Structure Prior | Feb 23, 2024 | Object, Object Rearrangement | Code Available | 2 |
| GRiT: A Generative Region-to-text Transformer for Object Understanding | Dec 1, 2022 | Decoder, Dense Captioning | Code Available | 2 |
| ALBench: A Framework for Evaluating Active Learning in Object Detection | Jul 27, 2022 | Active Learning, image-classification | Code Available | 2 |
| ImageDream: Image-Prompt Multi-view Diffusion for 3D Generation | Dec 2, 2023 | 3D Generation, Object | Code Available | 2 |
| In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation | Aug 9, 2024 | Image to text, Object | Code Available | 2 |
| In-Hand Object Rotation via Rapid Motor Adaptation | Oct 10, 2022 | Object, Reinforcement Learning (RL) | Code Available | 2 |
| InstructSAM: A Training-Free Framework for Instruction-Oriented Remote Sensing Object Recognition | May 21, 2025 | Earth Observation, Object | Code Available | 2 |
| InteractVLM: 3D Interaction Reasoning from 2D Foundational Models | Apr 7, 2025 | 3D Reconstruction, Object | Code Available | 2 |
| DeepInteraction: 3D Object Detection via Modality Interaction | Aug 23, 2022 | 3D Object Detection, Decoder | Code Available | 2 |
| InterFusion: Text-Driven Generation of 3D Human-Object Interaction | Mar 22, 2024 | 3D Generation, global-optimization | Code Available | 2 |
| Joint Reconstruction of 3D Human and Object via Contact-Based Refinement Transformer | Apr 7, 2024 | 3D Human Reconstruction, 3D Object Reconstruction | Code Available | 2 |
| Autoregressive Visual Tracking | Jan 1, 2023 | Object, Object Tracking | Code Available | 2 |
| K-Radar: 4D Radar Object Detection for Autonomous Driving in Various Weather Conditions | Jun 16, 2022 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| DeMo: Decoupled Feature-Based Mixture of Experts for Multi-Modal Object Re-Identification | Dec 14, 2024 | Mixture-of-Experts, Object | Code Available | 2 |
| LayoutDiffusion: Controllable Diffusion Model for Layout-to-image Generation | Mar 30, 2023 | Image Generation, Layout-to-Image Generation | Code Available | 2 |
| Learning Embeddings with Centroid Triplet Loss for Object Identification in Robotic Grasping | Apr 9, 2024 | Image Retrieval, Object | Code Available | 2 |
| LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis | Dec 19, 2024 | Object | Code Available | 2 |
| LeYOLO, New Scalable and Efficient CNN Architecture for Object Detection | Jun 20, 2024 | Computational Efficiency, Object | Code Available | 2 |
| DAVE -- A Detect-and-Verify Paradigm for Low-Shot Counting | Apr 25, 2024 | Exemplar-Free Counting, Few-shot Object Counting and Detection | Code Available | 2 |
| DAMSDet: Dynamic Adaptive Multispectral Detection Transformer with Competitive Query Selection and Adaptive Feature Fusion | Mar 1, 2024 | Object, object-detection | Code Available | 2 |
| Make It Count: Text-to-Image Generation with an Accurate Number of Objects | Jun 14, 2024 | Denoising, Image Generation | Code Available | 2 |
| MambaMOS: LiDAR-based 3D Moving Object Segmentation with Motion-aware State Space Model | Apr 19, 2024 | Object, Semantic Segmentation | Code Available | 2 |
| Decoupling Features in Hierarchical Propagation for Video Object Segmentation | Oct 18, 2022 | Object, Semantic Segmentation | Code Available | 2 |
| MegaPose: 6D Pose Estimation of Novel Objects via Render & Compare | Dec 13, 2022 | 3D Object Detection, 6D Pose Estimation | Code Available | 2 |
| AdaMixer: A Fast-Converging Query-Based Object Detector | Mar 30, 2022 | Object, Object Detection | Code Available | 2 |
| Meta-DETR: Image-Level Few-Shot Detection with Inter-Class Correlation Exploitation | Jul 30, 2022 | Few-Shot Object Detection, Meta-Learning | Code Available | 2 |
| Cross-View Referring Multi-Object Tracking | Dec 23, 2024 | Cross-view Referring Multi-Object Tracking, Multi-Object Tracking | Code Available | 2 |
| Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding | Nov 28, 2023 | Hallucination, Object | Code Available | 2 |
| MonoCD: Monocular 3D Object Detection with Complementary Depths | Apr 4, 2024 | 3D Object Detection, Depth Estimation | Code Available | 2 |
| Monocular 3D Object Detection with Depth from Motion | Jul 26, 2022 | 3D Object Detection, Depth Estimation | Code Available | 2 |
| DeepFusionMOT: A 3D Multi-Object Tracking Framework Based on Camera-LiDAR Fusion with Deep Association | Feb 24, 2022 | 3D Multi-Object Tracking, Multi-Object Tracking | Code Available | 2 |
| Beyond MOT: Semantic Multi-Object Tracking | Mar 8, 2024 | Multi-Object Tracking, Object | Code Available | 2 |
| Dense Distinct Query for End-to-End Object Detection | Mar 22, 2023 | Object, object-detection | Code Available | 2 |
| CPA-Enhancer: Chain-of-Thought Prompted Adaptive Enhancer for Object Detection under Unknown Degradations | Mar 17, 2024 | Object, object-detection | Code Available | 2 |