| RT-DETRv3: Real-time End-to-End Object Detection with Hierarchical Dense Positive Supervision | Sep 13, 2024 | Decoder, object-detection | Code Available | 3 | 5 |
| A Comparative Analysis of Object Detection Metrics with a Companion Open-Source Toolkit | Jan 25, 2021 | Object, object-detection | Code Available | 3 | 5 |
| Practical Video Object Detection via Feature Selection and Aggregation | Jul 29, 2024 | feature selection, GPU | Code Available | 3 | 5 |
| PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition | Mar 26, 2024 | Image Classification, Instance Segmentation | Code Available | 3 | 5 |
| Theoretically Achieving Continuous Representation of Oriented Bounding Boxes | Feb 29, 2024 | Fairness, object-detection | Code Available | 3 | 5 |
| Towards Automatic Power Battery Detection: New Challenge Benchmark Dataset and Baseline | Jan 1, 2024 | Crowd Counting, object-detection | Code Available | 3 | 5 |
| PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images | Jun 2, 2022 | 3D Lane Detection, 3D Object Detection | Code Available | 3 | 5 |
| Playing Non-Embedded Card-Based Games with Reinforcement Learning | Apr 7, 2025 | Board Games, Decision Making | Code Available | 3 | 5 |
| Detect Anything 3D in the Wild | Apr 10, 2025 | 3D Object Detection, Autonomous Driving | Code Available | 3 | 5 |
| Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling | Jan 9, 2023 | 2D Object Detection, Contrastive Learning | Code Available | 3 | 5 |
| RCBEVDet: Radar-camera Fusion in Bird's Eye View for 3D Object Detection | Mar 25, 2024 | 3D Object Detection, 3D Object Detection (RoI) | Code Available | 3 | 5 |
| Panacea+: Panoramic and Controllable Video Generation for Autonomous Driving | Aug 14, 2024 | 3D Object Detection, 3D Object Tracking | Code Available | 3 | 5 |
| Collaborative Novel Object Discovery and Box-Guided Cross-Modal Alignment for Open-Vocabulary 3D Object Detection | Jun 2, 2024 | 3D Object Detection, cross-modal alignment | Code Available | 3 | 5 |
| Vision-Language Pre-training: Basics, Recent Advances, and Future Trends | Oct 17, 2022 | Few-Shot Learning, Image Captioning | Code Available | 3 | 5 |
| OVLW-DETR: Open-Vocabulary Light-Weighted Detection Transformer | Jul 15, 2024 | Language Modeling, Language Modelling | Code Available | 3 | 5 |
| PETR: Position Embedding Transformation for Multi-View 3D Object Detection | Mar 10, 2022 | 3D Object Detection, Object | Code Available | 3 | 5 |
| Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields | Nov 24, 2016 | 2D Human Pose Estimation, 2D Pose Estimation | Code Available | 3 | 5 |
| Vision Transformers: From Semantic Segmentation to Dense Prediction | Jul 19, 2022 | image-classification, Image Classification | Code Available | 3 | 5 |
| Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking | Mar 27, 2022 | CPU, Multi-Object Tracking | Code Available | 3 | 5 |
| OmDet: Large-scale vision-language multi-dataset pre-training with multimodal detection network | Sep 10, 2022 | Continual Learning, Object | Code Available | 3 | 5 |
| MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining | Mar 20, 2024 | Aerial Scene Classification, Building change detection for remote sensing images | Code Available | 3 | 5 |
| MMLSpark: Unifying Machine Learning Ecosystems at Massive Scales | Oct 20, 2018 | BIG-bench Machine Learning, Distributed Computing | Code Available | 3 | 5 |
| Multiple Object Tracking as ID Prediction | Mar 25, 2024 | Multi-Object Tracking, Multiple Object Tracking | Code Available | 3 | 5 |
| Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation | Jun 4, 2024 | 2D Object Detection, 3D Instance Segmentation | Code Available | 3 | 5 |
| Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection | Dec 5, 2019 | Object, object-detection | Code Available | 3 | 5 |
| MaxViT: Multi-Axis Vision Transformer | Apr 4, 2022 | image-classification, Image Classification | Code Available | 3 | 5 |
| Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community | Aug 17, 2024 | Novel Concepts, Object | Code Available | 3 | 5 |
| Leveraging Vision-Centric Multi-Modal Expertise for 3D Object Detection | Oct 24, 2023 | 3D Object Detection, object-detection | Code Available | 3 | 5 |
| LION: Linear Group RNN for 3D Object Detection in Point Clouds | Jul 25, 2024 | 3D Object Detection, Long-range modeling | Code Available | 3 | 5 |
| MagicDrive: Street View Generation with Diverse 3D Geometry Control | Oct 4, 2023 | 3D geometry, 3D Object Detection | Code Available | 3 | 5 |
| OrionBench: A Benchmark for Chart and Human-Recognizable Object Detection in Infographics | May 23, 2025 | Chart Understanding, object-detection | Code Available | 3 | 5 |
| How to Evaluate the Generalization of Detection? A Benchmark for Comprehensive Open-Vocabulary Detection | Aug 25, 2023 | Object Detection | Code Available | 3 | 5 |
| BEVDet4D: Exploit Temporal Cues in Multi-camera 3D Object Detection | Mar 31, 2022 | 3D Object Detection, object-detection | Code Available | 3 | 5 |
| Integer-Valued Training and Spike-Driven Inference Spiking Neural Network for High-performance and Energy-efficient Object Detection | Jul 30, 2024 | object-detection, Object Detection | Code Available | 3 | 5 |
| Bag of Freebies for Training Object Detection Neural Networks | Feb 11, 2019 | General Classification, image-classification | Code Available | 3 | 5 |
| Geometric-aware Pretraining for Vision-centric 3D Object Detection | Apr 6, 2023 | 3D Object Detection, Autonomous Driving | Code Available | 3 | 5 |
| 5%>100%: Breaking Performance Shackles of Full Fine-Tuning on Visual Recognition Tasks | Aug 15, 2024 | image-classification, Image Classification | Code Available | 3 | 5 |
| General Object Foundation Model for Images and Videos at Scale | Dec 14, 2023 | Instance Segmentation, Long-tail Video Object Segmentation | Code Available | 3 | 5 |
| A Survey of Camouflaged Object Detection and Beyond | Aug 26, 2024 | Instance Segmentation, Object | Code Available | 3 | 5 |
| Frequency Dynamic Convolution for Dense Image Prediction | Mar 24, 2025 | object-detection, Object Detection | Code Available | 3 | 5 |
| EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation | Mar 22, 2023 | 3D Object Detection, 6D Pose Estimation using RGB | Code Available | 3 | 5 |
| Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation | Aug 9, 2024 | object-detection, Object Detection | Code Available | 3 | 5 |
| Falcon: A Remote Sensing Vision-Language Foundation Model | Mar 14, 2025 | Image Captioning, image-classification | Code Available | 3 | 5 |
| A Survey on Performance Metrics for Object-Detection Algorithms | Jul 21, 2020 | Benchmarking, Object | Code Available | 3 | 5 |
| AM-RADIO: Agglomerative Vision Foundation Model -- Reduce All Domains Into One | Dec 10, 2023 | All, Benchmarking | Code Available | 3 | 5 |
| EfficientDet: Scalable and Efficient Object Detection | Nov 20, 2019 | AutoML, Object | Code Available | 3 | 5 |
| Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection | Jun 8, 2020 | Dense Object Detection, General Classification | Code Available | 3 | 5 |
| IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection | Mar 22, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 3 | 5 |
| OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion | Jul 10, 2024 | Object Detection, Zero-Shot Object Detection | Code Available | 3 | 5 |
| Salience DETR: Enhancing Detection Transformer with Hierarchical Salience Filtering Refinement | Mar 24, 2024 | 2D Object Detection, Computational Efficiency | Code Available | 3 | 5 |