| YOLOv10: Real-Time End-to-End Object Detection | May 23, 2024 | 2D Object Detection, Data Augmentation | Code Available | 11 | 5 |
| TripoSR: Fast 3D Object Reconstruction from a Single Image | Mar 4, 2024 | 3D Generation, 3D Object Reconstruction | Code Available | 9 | 5 |
| YOLO-World: Real-Time Open-Vocabulary Object Detection | Jan 30, 2024 | Instance Segmentation, Language Modeling | Code Available | 9 | 5 |
| DETRs Beat YOLOs on Real-time Object Detection | Apr 17, 2023 | 2D Object Detection, Decoder | Code Available | 8 | 5 |
| Visual-RFT: Visual Reinforcement Fine-Tuning | Mar 3, 2025 | Few-Shot Object Detection, Fine-Grained Image Classification | Code Available | 7 | 5 |
| Efficient Track Anything | Nov 28, 2024 | Object, Segmentation | Code Available | 7 | 5 |
| T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy | Mar 21, 2024 | Contrastive Learning, Descriptive | Code Available | 7 | 5 |
| YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors | Jul 6, 2022 | 2D Object Detection, GPU | Code Available | 7 | 5 |
| YOLOv12: Attention-Centric Real-Time Object Detectors | Feb 18, 2025 | GPU, Object | Code Available | 7 | 5 |
| DragAnything: Motion Control for Anything using Entity Representation | Mar 12, 2024 | Object, Video Generation | Code Available | 7 | 5 |
| Underwater Camouflaged Object Tracking Meets Vision-Language SAM2 | Sep 25, 2024 | Object, Object Tracking | Code Available | 5 | 5 |
| UnCommon Objects in 3D | Jan 13, 2025 | Object | Code Available | 5 | 5 |
| Matching Anything by Segmenting Anything | Jun 6, 2024 | Domain Generalization, Multiple Object Tracking | Code Available | 5 | 5 |
| Slicing Aided Hyper Inference and Fine-tuning for Small Object Detection | Feb 14, 2022 | Object, object-detection | Code Available | 5 | 5 |
| GaussianObject: High-Quality 3D Object Reconstruction from Four Views with Gaussian Splatting | Feb 15, 2024 | 3D Object Reconstruction, Neural Rendering | Code Available | 5 | 5 |
| Awesome Multi-modal Object Tracking | May 23, 2024 | Autonomous Driving, Knowledge Distillation | Code Available | 5 | 5 |
| RealFusion: 360° Reconstruction of Any Object from a Single Image | Feb 21, 2023 | 3D Reconstruction, Object | Code Available | 5 | 5 |
| DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding | Nov 21, 2024 | Long-tailed Object Detection, Object | Code Available | 5 | 5 |
| Street Gaussians: Modeling Dynamic Urban Scenes with Gaussian Splatting | Jan 2, 2024 | Autonomous Driving, NeRF | Code Available | 5 | 5 |
| Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos | Jan 7, 2025 | 2k, Language Modeling | Code Available | 5 | 5 |
| PAIR Diffusion: A Comprehensive Multimodal Object-Level Image Editor | Jan 1, 2024 | Object | Code Available | 4 | 5 |
| DiffusionDet: Diffusion Model for Object Detection | Nov 17, 2022 | Denoising, model | Code Available | 4 | 5 |
| Mamba YOLO: A Simple Baseline for Object Detection with State Space Model | Jun 9, 2024 | GPU, Mamba | Code Available | 4 | 5 |
| Efficient Part-level 3D Object Generation via Dual Volume Packing | Jun 11, 2025 | Diversity, Object | Code Available | 4 | 5 |
| GCoNet+: A Stronger Group Collaborative Co-Salient Object Detector | May 30, 2022 | Co-Salient Object Detection, Object | Code Available | 4 | 5 |
| Transformer for Object Re-Identification: A Survey | Jan 13, 2024 | Object, Survey | Code Available | 4 | 5 |
| Strip R-CNN: Large Strip Convolution for Remote Sensing Object Detection | Jan 7, 2025 | Object, object-detection | Code Available | 4 | 5 |
| UltimateDO: An Efficient Framework to Marry Occupancy Prediction with 3D Object Detection via Channel2height | Sep 17, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 4 | 5 |
| SiamMask: A Framework for Fast Online Object Tracking and Segmentation | Jul 5, 2022 | Multiple Object Tracking, Object | Code Available | 4 | 5 |
| AnyDoor: Zero-shot Object-level Image Customization | Jul 18, 2023 | Object, Virtual Try-on | Code Available | 4 | 5 |
| Detectron2 Object Detection & Manipulating Images using Cartoonization | Aug 1, 2021 | Autonomous Vehicles, Data Visualization | Code Available | 4 | 5 |
| RUMI: Rummaging Using Mutual Information | Aug 19, 2024 | Model Predictive Control, Object | Code Available | 4 | 5 |
| SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree | Oct 21, 2024 | Heuristic Search, Object | Code Available | 4 | 5 |
| RTMDet: An Empirical Study of Designing Real-Time Object Detectors | Dec 14, 2022 | GPU, Instance Segmentation | Code Available | 4 | 5 |
| SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection | Mar 11, 2024 | 2D Object Detection, 2k | Code Available | 4 | 5 |
| OmDet: Large-scale vision-language multi-dataset pre-training with multimodal detection network | Sep 10, 2022 | Continual Learning, Object | Code Available | 3 | 5 |
| Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking | Mar 27, 2022 | CPU, Multi-Object Tracking | Code Available | 3 | 5 |
| MureObjectStitch: Multi-reference Image Composition | Nov 12, 2024 | Object | Code Available | 3 | 5 |
| Multiple Object Tracking as ID Prediction | Mar 25, 2024 | Multi-Object Tracking, Multiple Object Tracking | Code Available | 3 | 5 |
| NeROIC: Neural Rendering of Objects from Online Image Collections | Jan 7, 2022 | Neural Rendering, Novel View Synthesis | Code Available | 3 | 5 |
| Panacea+: Panoramic and Controllable Video Generation for Autonomous Driving | Aug 14, 2024 | 3D Object Detection, 3D Object Tracking | Code Available | 3 | 5 |
| Motion Representations for Articulated Animation | Apr 22, 2021 | Object, Video Reconstruction | Code Available | 3 | 5 |
| MotionCtrl: A Unified and Flexible Motion Controller for Video Generation | Dec 6, 2023 | Object, Video Generation | Code Available | 3 | 5 |
| Moving Object Segmentation: All You Need Is SAM (and Flow) | Apr 18, 2024 | All, Motion Segmentation | Code Available | 3 | 5 |
| MagicDrive: Street View Generation with Diverse 3D Geometry Control | Oct 4, 2023 | 3D geometry, 3D Object Detection | Code Available | 3 | 5 |
| InterMimic: Towards Universal Whole-Body Control for Physics-Based Human-Object Interactions | Feb 27, 2025 | Human-Object Interaction Detection, Object | Code Available | 3 | 5 |
| BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects | Mar 24, 2023 | 3D Object Detection, 3D Object Tracking | Code Available | 3 | 5 |
| Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community | Aug 17, 2024 | Novel Concepts, Object | Code Available | 3 | 5 |
| MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining | Mar 20, 2024 | Aerial Scene Classification, Building change detection for remote sensing images | Code Available | 3 | 5 |
| PETR: Position Embedding Transformation for Multi-View 3D Object Detection | Mar 10, 2022 | 3D Object Detection, Object | Code Available | 3 | 5 |