| Detect Everything with Few Examples | Sep 22, 2023 | Binary Classification, Cross-Domain Few-Shot Object Detection | Code Available | 2 |
| PointLLM: Empowering Large Language Models to Understand Point Clouds | Aug 31, 2023 | 3D Object Captioning, 3D Object Classification | Code Available | 2 |
| InterDiff: Generating 3D Human-Object Interactions with Physics-Informed Diffusion | Aug 31, 2023 | 3D Human Dynamics, Human Dynamics | Code Available | 2 |
| DiffusionTrack: Diffusion Model For Multi-Object Tracking | Aug 19, 2023 | Denoising, model | Code Available | 2 |
| SparseBEV: High-Performance Sparse 3D Object Detection from Multi-Camera Videos | Aug 18, 2023 | 3D Object Detection, Object | Code Available | 2 |
| MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions | Aug 16, 2023 | Motion Expressions Guided Video Segmentation, Object | Code Available | 2 |
| YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection | Aug 10, 2023 | Object, object-detection | Code Available | 2 |
| FocalFormer3D: Focusing on Hard Instance for 3D Object Detection | Aug 8, 2023 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control | Jul 28, 2023 | Object, Question Answering | Code Available | 2 |
| MeMOTR: Long-Term Memory-Augmented Transformer for Multi-Object Tracking | Jul 28, 2023 | Multi-Object Tracking, Multiple Object Tracking | Code Available | 2 |
| Tracking Anything in High Quality | Jul 26, 2023 | Object, Object Tracking | Code Available | 2 |
| COCO-O: A Benchmark for Object Detectors under Natural Distribution Shifts | Jul 24, 2023 | Autonomous Driving, Object | Code Available | 2 |
| CNOS: A Strong Baseline for CAD-based Novel Object Segmentation | Jul 20, 2023 | Object, Semantic Segmentation | Code Available | 2 |
| DragonDiffusion: Enabling Drag-style Manipulation on Diffusion Models | Jul 5, 2023 | Object | Code Available | 2 |
| RVT: Robotic View Transformer for 3D Object Manipulation | Jun 26, 2023 | Object, Robot Manipulation | Code Available | 2 |
| OpenMask3D: Open-Vocabulary 3D Instance Segmentation | Jun 23, 2023 | 3D Instance Segmentation, 3D Open-Vocabulary Instance Segmentation | Code Available | 2 |
| SoftGPT: Learn Goal-oriented Soft Object Manipulation Skills by Generative Pre-trained Heterogeneous Graph Transformer | Jun 22, 2023 | Object | Code Available | 2 |
| Aria Digital Twin: A New Benchmark Dataset for Egocentric 3D Machine Perception | Jun 10, 2023 | 3D Object Detection, Benchmarking | Code Available | 2 |
| DetZero: Rethinking Offboard 3D Object Detection with Long-term Sequential Point Clouds | Jun 9, 2023 | 3D Multi-Object Tracking, 3D Object Detection | Code Available | 2 |
| SAM3D: Zero-Shot 3D Object Detection via Segment Anything Model | Jun 4, 2023 | 3D Object Detection, Image Segmentation | Code Available | 2 |
| Multi-modal Queried Object Detection in the Wild | May 30, 2023 | Few-Shot Object Detection, Object | Code Available | 2 |
| Contextual Object Detection with Multimodal Large Language Models | May 29, 2023 | Cloze Test, Decoder | Code Available | 2 |
| NeRO: Neural Geometry and BRDF Reconstruction of Reflective Objects from Multiview Images | May 27, 2023 | Neural Rendering, Object | Code Available | 2 |
| DetGPT: Detect What You Need via Reasoning | May 23, 2023 | Autonomous Driving, Object | Code Available | 2 |
| Going Denser with Open-Vocabulary Part Segmentation | May 18, 2023 | Object, object-detection | Code Available | 2 |
| Evaluating Object Hallucination in Large Vision-Language Models | May 17, 2023 | Hallucination, Object | Code Available | 2 |
| Video Object Segmentation in Panoptic Wild Scenes | May 8, 2023 | Object, Semantic Segmentation | Code Available | 2 |
| PillarNeXt: Rethinking Network Designs for 3D Object Detection in LiDAR Point Clouds | May 8, 2023 | 2D Object Detection, 3D Object Detection | Code Available | 2 |
| SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model | May 3, 2023 | Instance Segmentation, Object | Code Available | 2 |
| SportsMOT: A Large Multi-Object Tracking Dataset in Multiple Sports Scenes | Apr 11, 2023 | Multi-Object Tracking, Multiple Object Tracking | Code Available | 2 |
| EA-LSS: Edge-aware Lift-splat-shot Framework for 3D BEV Object Detection | Mar 31, 2023 | 3D Object Detection, Depth Estimation | Code Available | 2 |
| LayoutDiffusion: Controllable Diffusion Model for Layout-to-image Generation | Mar 30, 2023 | Image Generation, Layout-to-Image Generation | Code Available | 2 |
| PAIR-Diffusion: A Comprehensive Multimodal Object-Level Image Editor | Mar 30, 2023 | Object | Code Available | 2 |
| NOPE: Novel Object Pose Estimation from a Single Image | Mar 23, 2023 | Object, Pose Estimation | Code Available | 2 |
| Dense Distinct Query for End-to-End Object Detection | Mar 22, 2023 | Object, object-detection | Code Available | 2 |
| Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection | Mar 21, 2023 | 3D Multi-Object Tracking, 3D Object Detection | Code Available | 2 |
| VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking | Mar 20, 2023 | 3D Object Detection, Object | Code Available | 2 |
| Large Selective Kernel Network for Remote Sensing Object Detection | Mar 16, 2023 | Object, object-detection | Code Available | 2 |
| InstMove: Instance Motion for Object-centric Video Segmentation | Mar 14, 2023 | Object, Optical Flow Estimation | Code Available | 2 |
| Virtual Sparse Convolution for Multimodal 3D Object Detection | Mar 4, 2023 | 3D Multi-Object Tracking, 3D Object Detection | Code Available | 2 |
| Fusing Visual Appearance and Geometry for Multi-modality 6DoF Object Tracking | Feb 22, 2023 | 3D Object Tracking, 6D Pose Estimation | Code Available | 2 |
| Efficient Teacher: Semi-Supervised Object Detection for YOLOv5 | Feb 15, 2023 | Object, object-detection | Code Available | 2 |
| EdgeYOLO: An Edge-Real-Time Object Detector | Feb 15, 2023 | Data Augmentation, Edge-computing | Code Available | 2 |
| Standing Between Past and Future: Spatio-Temporal Modeling for Multi-Camera 3D Multi-Object Tracking | Feb 7, 2023 | 3D Multi-Object Tracking, Multi-Object Tracking | Code Available | 2 |
| MOSE: A New Dataset for Video Object Segmentation in Complex Scenes | Feb 3, 2023 | Object, Segmentation | Code Available | 2 |
| vMAP: Vectorised Object Mapping for Neural Field SLAM | Feb 3, 2023 | Object | Code Available | 2 |
| OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation | Jan 18, 2023 | Novel View Synthesis, Object | Code Available | 2 |
| PACO: Parts and Attributes of Common Objects | Jan 4, 2023 | 2D Object Detection, Attribute | Code Available | 2 |
| FocalFormer3D: Focusing on Hard Instance for 3D Object Detection | Jan 1, 2023 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| Autoregressive Visual Tracking | Jan 1, 2023 | Object, Object Tracking | Code Available | 2 |