| EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything | Dec 1, 2023 | Decoder, image-classification | Code Available | 4 |
| Vision GNN: An Image is Worth Graph of Nodes | Jun 1, 2022 | Image Classification, Object Detection | Code Available | 4 |
| TUMTraf V2X Cooperative Perception Dataset | Mar 2, 2024 | 3D Object Detection, Autonomous Vehicles | Code Available | 4 |
| DN-DETR: Accelerate DETR Training by Introducing Query DeNoising | Mar 2, 2022 | Decoder, Object Detection | Code Available | 4 |
| UltimateDO: An Efficient Framework to Marry Occupancy Prediction with 3D Object Detection via Channel2height | Sep 17, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 4 |
| Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement | Mar 9, 2025 | Domain Generalization, Object Detection | Code Available | 4 |
| Detectron2 Object Detection & Manipulating Images using Cartoonization | Aug 1, 2021 | Autonomous Vehicles, Data Visualization | Code Available | 4 |
| SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection | Mar 11, 2024 | 2D Object Detection, 2k | Code Available | 4 |
| RTMDet: An Empirical Study of Designing Real-Time Object Detectors | Dec 14, 2022 | GPU, Instance Segmentation | Code Available | 4 |
| DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection | Mar 7, 2022 | Object Detection, Real-Time Object Detection | Code Available | 4 |
| Deep Residual Learning for Image Recognition | Dec 10, 2015 | Classification | Code Available | 4 |
| Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN | May 27, 2022 | Image Classification, Instance Segmentation | Code Available | 4 |
| FG-CLIP: Fine-Grained Visual and Textual Alignment | May 8, 2025 | Image-text Retrieval, object-detection | Code Available | 4 |
| Strip R-CNN: Large Strip Convolution for Remote Sensing Object Detection | Jan 7, 2025 | Object, object-detection | Code Available | 4 |
| Relation DETR: Exploring Explicit Position Relation Prior for Object Detection | Jul 16, 2024 | 2D Object Detection, object-detection | Code Available | 3 |
| ResNeSt: Split-Attention Networks | Apr 19, 2020 | image-classification, Image Classification | Code Available | 3 |
| Cut and Learn for Unsupervised Object Detection and Instance Segmentation | Jan 26, 2023 | Instance Segmentation, object-detection | Code Available | 3 |
| Cubify Anything: Scaling Indoor 3D Object Detection | Dec 5, 2024 | 3D Object Detection, Object | Code Available | 3 |
| Rethinking the Evaluation of Visible and Infrared Image Fusion | Oct 9, 2024 | object-detection, Object Detection | Code Available | 3 |
| Practical Video Object Detection via Feature Selection and Aggregation | Jul 29, 2024 | feature selection, GPU | Code Available | 3 |
| Cross Modal Transformer: Towards Fast and Robust 3D Object Detection | Jan 3, 2023 | 3D Object Detection, object-detection | Code Available | 3 |
| RCBEVDet: Radar-camera Fusion in Bird's Eye View for 3D Object Detection | Mar 25, 2024 | 3D Object Detection, 3D Object Detection (RoI) | Code Available | 3 |
| Playing Non-Embedded Card-Based Games with Reinforcement Learning | Apr 7, 2025 | Board Games, Decision Making | Code Available | 3 |
| PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition | Mar 26, 2024 | Image Classification, Instance Segmentation | Code Available | 3 |
| Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields | Nov 24, 2016 | 2D Human Pose Estimation, 2D Pose Estimation | Code Available | 3 |
| ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders | Jan 2, 2023 | Object Detection, Representation Learning | Code Available | 3 |
| OVLW-DETR: Open-Vocabulary Light-Weighted Detection Transformer | Jul 15, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| Panacea+: Panoramic and Controllable Video Generation for Autonomous Driving | Aug 14, 2024 | 3D Object Detection, 3D Object Tracking | Code Available | 3 |
| PETR: Position Embedding Transformation for Multi-View 3D Object Detection | Mar 10, 2022 | 3D Object Detection, Object | Code Available | 3 |
| Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation | Jun 4, 2024 | 2D Object Detection, 3D Instance Segmentation | Code Available | 3 |
| OmDet: Large-scale vision-language multi-dataset pre-training with multimodal detection network | Sep 10, 2022 | Continual Learning, Object | Code Available | 3 |
| OrionBench: A Benchmark for Chart and Human-Recognizable Object Detection in Infographics | May 23, 2025 | Chart Understanding, object-detection | Code Available | 3 |
| Multiple Object Tracking as ID Prediction | Mar 25, 2024 | Multi-Object Tracking, Multiple Object Tracking | Code Available | 3 |
| MMLSpark: Unifying Machine Learning Ecosystems at Massive Scales | Oct 20, 2018 | BIG-bench Machine Learning, Distributed Computing | Code Available | 3 |
| Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking | Mar 27, 2022 | CPU, Multi-Object Tracking | Code Available | 3 |
| OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion | Jul 10, 2024 | Object Detection, Zero-Shot Object Detection | Code Available | 3 |
| PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images | Jun 2, 2022 | 3D Lane Detection, 3D Object Detection | Code Available | 3 |
| Revisiting Image Pyramid Structure for High Resolution Salient Object Detection | Sep 20, 2022 | Dichotomous Image Segmentation, Object Detection | Code Available | 3 |
| MagicDrive: Street View Generation with Diverse 3D Geometry Control | Oct 4, 2023 | 3D geometry, 3D Object Detection | Code Available | 3 |
| Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community | Aug 17, 2024 | Novel Concepts, Object | Code Available | 3 |
| LION: Linear Group RNN for 3D Object Detection in Point Clouds | Jul 25, 2024 | 3D Object Detection, Long-range modeling | Code Available | 3 |
| Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection | Dec 5, 2019 | Object, object-detection | Code Available | 3 |
| IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection | Mar 22, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 3 |
| Leveraging Vision-Centric Multi-Modal Expertise for 3D Object Detection | Oct 24, 2023 | 3D Object Detection, object-detection | Code Available | 3 |
| Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models | Jun 10, 2025 | 3D Lane Detection, 3D Object Detection | Code Available | 3 |
| How to Evaluate the Generalization of Detection? A Benchmark for Comprehensive Open-Vocabulary Detection | Aug 25, 2023 | Object Detection | Code Available | 3 |
| Collaborative Novel Object Discovery and Box-Guided Cross-Modal Alignment for Open-Vocabulary 3D Object Detection | Jun 2, 2024 | 3D Object Detection, cross-modal alignment | Code Available | 3 |
| MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining | Mar 20, 2024 | Aerial Scene Classification, Building change detection for remote sensing images | Code Available | 3 |
| Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation | Aug 9, 2024 | object-detection, Object Detection | Code Available | 3 |
| A Comparative Analysis of Object Detection Metrics with a Companion Open-Source Toolkit | Jan 25, 2021 | Object, object-detection | Code Available | 3 |