| YOLOv10: Real-Time End-to-End Object Detection | May 23, 2024 | 2D Object Detection, Data Augmentation | Code Available | 11 |
| TripoSR: Fast 3D Object Reconstruction from a Single Image | Mar 4, 2024 | 3D Generation, 3D Object Reconstruction | Code Available | 9 |
| YOLO-World: Real-Time Open-Vocabulary Object Detection | Jan 30, 2024 | Instance Segmentation, Language Modeling | Code Available | 9 |
| DETRs Beat YOLOs on Real-time Object Detection | Apr 17, 2023 | 2D Object Detection, Decoder | Code Available | 8 |
| Visual-RFT: Visual Reinforcement Fine-Tuning | Mar 3, 2025 | Few-Shot Object Detection, Fine-Grained Image Classification | Code Available | 7 |
| YOLOv12: Attention-Centric Real-Time Object Detectors | Feb 18, 2025 | GPU, Object | Code Available | 7 |
| Efficient Track Anything | Nov 28, 2024 | Object, Segmentation | Code Available | 7 |
| T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy | Mar 21, 2024 | Contrastive Learning, Descriptive | Code Available | 7 |
| DragAnything: Motion Control for Anything using Entity Representation | Mar 12, 2024 | Object, Video Generation | Code Available | 7 |
| YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors | Jul 6, 2022 | 2D Object Detection, GPU | Code Available | 7 |
| UnCommon Objects in 3D | Jan 13, 2025 | Object | Code Available | 5 |
| Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos | Jan 7, 2025 | 2k, Language Modeling | Code Available | 5 |
| DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding | Nov 21, 2024 | Long-tailed Object Detection, Object | Code Available | 5 |
| Underwater Camouflaged Object Tracking Meets Vision-Language SAM2 | Sep 25, 2024 | Object, Object Tracking | Code Available | 5 |
| Matching Anything by Segmenting Anything | Jun 6, 2024 | Domain Generalization, Multiple Object Tracking | Code Available | 5 |
| Awesome Multi-modal Object Tracking | May 23, 2024 | Autonomous Driving, Knowledge Distillation | Code Available | 5 |
| GaussianObject: High-Quality 3D Object Reconstruction from Four Views with Gaussian Splatting | Feb 15, 2024 | 3D Object Reconstruction, Neural Rendering | Code Available | 5 |
| Street Gaussians: Modeling Dynamic Urban Scenes with Gaussian Splatting | Jan 2, 2024 | Autonomous Driving, NeRF | Code Available | 5 |
| RealFusion: 360° Reconstruction of Any Object from a Single Image | Feb 21, 2023 | 3D Reconstruction, Object | Code Available | 5 |
| Slicing Aided Hyper Inference and Fine-tuning for Small Object Detection | Feb 14, 2022 | Object, object-detection | Code Available | 5 |
| Efficient Part-level 3D Object Generation via Dual Volume Packing | Jun 11, 2025 | Diversity, Object | Code Available | 4 |
| Strip R-CNN: Large Strip Convolution for Remote Sensing Object Detection | Jan 7, 2025 | Object, object-detection | Code Available | 4 |
| SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree | Oct 21, 2024 | Heuristic Search, Object | Code Available | 4 |
| UltimateDO: An Efficient Framework to Marry Occupancy Prediction with 3D Object Detection via Channel2height | Sep 17, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 4 |
| RUMI: Rummaging Using Mutual Information | Aug 19, 2024 | Model Predictive Control, Object | Code Available | 4 |
| Mamba YOLO: A Simple Baseline for Object Detection with State Space Model | Jun 9, 2024 | GPU, Mamba | Code Available | 4 |
| SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection | Mar 11, 2024 | 2D Object Detection, 2k | Code Available | 4 |
| Transformer for Object Re-Identification: A Survey | Jan 13, 2024 | Object, Survey | Code Available | 4 |
| PAIR Diffusion: A Comprehensive Multimodal Object-Level Image Editor | Jan 1, 2024 | Object | Code Available | 4 |
| AnyDoor: Zero-shot Object-level Image Customization | Jul 18, 2023 | Object, Virtual Try-on | Code Available | 4 |
| RTMDet: An Empirical Study of Designing Real-Time Object Detectors | Dec 14, 2022 | GPU, Instance Segmentation | Code Available | 4 |
| DiffusionDet: Diffusion Model for Object Detection | Nov 17, 2022 | Denoising, model | Code Available | 4 |
| SiamMask: A Framework for Fast Online Object Tracking and Segmentation | Jul 5, 2022 | Multiple Object Tracking, Object | Code Available | 4 |
| GCoNet+: A Stronger Group Collaborative Co-Salient Object Detector | May 30, 2022 | Co-Salient Object Detection, Object | Code Available | 4 |
| Detectron2 Object Detection & Manipulating Images using Cartoonization | Aug 1, 2021 | Autonomous Vehicles, Data Visualization | Code Available | 4 |
| Playing Non-Embedded Card-Based Games with Reinforcement Learning | Apr 7, 2025 | Board Games, Decision Making | Code Available | 3 |
| InterMimic: Towards Universal Whole-Body Control for Physics-Based Human-Object Interactions | Feb 27, 2025 | Human-Object Interaction Detection, Object | Code Available | 3 |
| CrossOver: 3D Scene Cross-Modal Alignment | Feb 20, 2025 | cross-modal alignment, Object | Code Available | 3 |
| VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM | Dec 31, 2024 | Object, Video Understanding | Code Available | 3 |
| Attentive Eraser: Unleashing Diffusion Model's Object Removal Potential via Self-Attention Redirection Guidance | Dec 17, 2024 | Image Generation, Object | Code Available | 3 |
| Cubify Anything: Scaling Indoor 3D Object Detection | Dec 5, 2024 | 3D Object Detection, Object | Code Available | 3 |
| MureObjectStitch: Multi-reference Image Composition | Nov 12, 2024 | Object | Code Available | 3 |
| A Survey of Camouflaged Object Detection and Beyond | Aug 26, 2024 | Instance Segmentation, Object | Code Available | 3 |
| A Survey of Embodied Learning for Object-Centric Robotic Manipulation | Aug 21, 2024 | Imitation Learning, Object | Code Available | 3 |
| Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community | Aug 17, 2024 | Novel Concepts, Object | Code Available | 3 |
| Panacea+: Panoramic and Controllable Video Generation for Autonomous Driving | Aug 14, 2024 | 3D Object Detection, 3D Object Tracking | Code Available | 3 |
| Practical Video Object Detection via Feature Selection and Aggregation | Jul 29, 2024 | feature selection, GPU | Code Available | 3 |
| Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model | Jul 24, 2024 | Image Inpainting, Object | Code Available | 3 |
| VISA: Reasoning Video Object Segmentation via Large Language Models | Jul 16, 2024 | Decoder, Object | Code Available | 3 |
| Collaborative Novel Object Discovery and Box-Guided Cross-Modal Alignment for Open-Vocabulary 3D Object Detection | Jun 2, 2024 | 3D Object Detection, cross-modal alignment | Code Available | 3 |