A Strong and Reproducible Object Detector with Only Public Datasets
Tianhe Ren, Jianwei Yang, Shilong Liu, Ailing Zeng, Feng Li, Hao Zhang, Hongyang Li, Zhaoyang Zeng, Lei Zhang
Code
- github.com/microsoft/FocalNet (official, in paper, PyTorch, ★ 750)
- github.com/idea-research/stable-dino (in paper, PyTorch, ★ 239)
Abstract
This work presents Focal-Stable-DINO, a strong and reproducible object detection model that achieves 64.6 AP on COCO val2017 and 64.8 AP on COCO test-dev with only 700M parameters and no test-time augmentation. It combines the powerful FocalNet-Huge backbone with the effective Stable-DINO detector. Unlike existing SOTA models that rely on an extensive number of parameters and complex training techniques applied to large-scale private or merged data, our model is trained exclusively on the publicly available Objects365 dataset, which ensures the reproducibility of our approach.
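The composition described in the abstract, a FocalNet-Huge backbone feeding a Stable-DINO detection head, can be sketched at a high level. The class names and interfaces below are hypothetical placeholders for illustration only, not the authors' implementation; see the linked repositories for the real code:

```python
# Hypothetical sketch of the backbone + detector composition described in
# the abstract. All names here are illustrative stand-ins, not the real API.

class FocalNetHugeBackbone:
    """Stand-in for the FocalNet-Huge backbone."""
    def forward(self, image):
        # A real backbone would return multi-scale feature maps;
        # here we just tag the input to show the data flow.
        return {"features": f"multi-scale features of {image}"}


class StableDINOHead:
    """Stand-in for the Stable-DINO detection head."""
    def forward(self, features):
        # A real head would decode boxes and class scores from the features.
        return [{"box": (0, 0, 10, 10), "score": 0.9, "label": "example"}]


class FocalStableDINO:
    """Backbone + detector composition, as in Focal-Stable-DINO."""
    def __init__(self):
        self.backbone = FocalNetHugeBackbone()
        self.head = StableDINOHead()

    def forward(self, image):
        feats = self.backbone.forward(image)
        return self.head.forward(feats)


detector = FocalStableDINO()
detections = detector.forward("img.jpg")
```

The point of the sketch is only the data flow: images pass through the backbone to produce features, which the detection head decodes into boxes and scores.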
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| COCO minival | Focal-Stable-DINO (Focal-Huge, no TTA) | box AP | 64.6 | — | Unverified |
| COCO test-dev | Focal-Stable-DINO (Focal-Huge, no TTA) | box AP | 64.8 | — | Unverified |