| ConvNeXt V2: Co-designing and Scaling ConvNets with Masked Autoencoders | Jan 2, 2023 | Object DetectionRepresentation Learning | CodeCode Available | 3 |
| OVLW-DETR: Open-Vocabulary Light-Weighted Detection Transformer | Jul 15, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| Panacea+: Panoramic and Controllable Video Generation for Autonomous Driving | Aug 14, 2024 | 3D Object Detection3D Object Tracking | CodeCode Available | 3 |
| PETR: Position Embedding Transformation for Multi-View 3D Object Detection | Mar 10, 2022 | 3D Object DetectionObject | CodeCode Available | 3 |
| Open-YOLO 3D: Towards Fast and Accurate Open-Vocabulary 3D Instance Segmentation | Jun 4, 2024 | 2D Object Detection3D Instance Segmentation | CodeCode Available | 3 |
| OmDet: Large-scale vision-language multi-dataset pre-training with multimodal detection network | Sep 10, 2022 | Continual LearningObject | CodeCode Available | 3 |
| OrionBench: A Benchmark for Chart and Human-Recognizable Object Detection in Infographics | May 23, 2025 | Chart Understandingobject-detection | CodeCode Available | 3 |
| Multiple Object Tracking as ID Prediction | Mar 25, 2024 | Multi-Object TrackingMultiple Object Tracking | CodeCode Available | 3 |
| MMLSpark: Unifying Machine Learning Ecosystems at Massive Scales | Oct 20, 2018 | BIG-bench Machine LearningDistributed Computing | CodeCode Available | 3 |
| Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking | Mar 27, 2022 | CPUMulti-Object Tracking | CodeCode Available | 3 |
| OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion | Jul 10, 2024 | Object DetectionZero-Shot Object Detection | CodeCode Available | 3 |
| PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images | Jun 2, 2022 | 3D Lane Detection3D Object Detection | CodeCode Available | 3 |
| Revisiting Image Pyramid Structure for High Resolution Salient Object Detection | Sep 20, 2022 | Dichotomous Image SegmentationObject Detection | CodeCode Available | 3 |
| MagicDrive: Street View Generation with Diverse 3D Geometry Control | Oct 4, 2023 | 3D geometry3D Object Detection | CodeCode Available | 3 |
| Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community | Aug 17, 2024 | Novel ConceptsObject | CodeCode Available | 3 |
| LION: Linear Group RNN for 3D Object Detection in Point Clouds | Jul 25, 2024 | 3D Object DetectionLong-range modeling | CodeCode Available | 3 |
| Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection | Dec 5, 2019 | Objectobject-detection | CodeCode Available | 3 |
| IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection | Mar 22, 2024 | 3D Object DetectionAutonomous Driving | CodeCode Available | 3 |
| Leveraging Vision-Centric Multi-Modal Expertise for 3D Object Detection | Oct 24, 2023 | 3D Object Detectionobject-detection | CodeCode Available | 3 |
| Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models | Jun 10, 2025 | 3D Lane Detection3D Object Detection | CodeCode Available | 3 |
| How to Evaluate the Generalization of Detection? A Benchmark for Comprehensive Open-Vocabulary Detection | Aug 25, 2023 | Object Detection | CodeCode Available | 3 |
| Collaborative Novel Object Discovery and Box-Guided Cross-Modal Alignment for Open-Vocabulary 3D Object Detection | Jun 2, 2024 | 3D Object Detectioncross-modal alignment | CodeCode Available | 3 |
| MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining | Mar 20, 2024 | Aerial Scene ClassificationBuilding change detection for remote sensing images | CodeCode Available | 3 |
| Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation | Aug 9, 2024 | object-detectionObject Detection | CodeCode Available | 3 |
| A Comparative Analysis of Object Detection Metrics with a Companion Open-Source Toolkit | Jan 25, 2021 | Objectobject-detection | CodeCode Available | 3 |