OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision
Junjie Wang, Bin Chen, Bin Kang, Yulin Li, YiChi Chen, Weizhi Xian, Huifeng Chang, Yong Xu
- github.com/xiaomoguhz/ov-dquo (official, in paper, PyTorch, ★ 36)
Abstract
Open-vocabulary detection aims to detect objects from novel categories beyond the base categories on which the detector is trained. However, existing open-vocabulary detectors trained on base category data tend to assign higher confidence to trained categories and confuse novel categories with the background. To resolve this, we propose OV-DQUO, an Open-Vocabulary DETR with Denoising text Query training and open-world Unknown Objects supervision. Specifically, we introduce a wildcard matching method. This method enables the detector to learn from pairs of unknown objects recognized by the open-world detector and text embeddings with general semantics, mitigating the confidence bias between base and novel categories. Additionally, we propose a denoising text query training strategy. It synthesizes foreground and background query-box pairs from open-world unknown objects to train the detector through contrastive learning, enhancing its ability to distinguish novel objects from the background. We conducted extensive experiments on the challenging OV-COCO and OV-LVIS benchmarks, achieving new state-of-the-art results of 45.6 AP50 and 39.3 mAP on novel categories, respectively, without the need for additional training data. Models and code are released at https://github.com/xiaomoguhz/OV-DQUO.
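The idea of synthesizing foreground and background query-box pairs can be illustrated with a small sketch. This is not the authors' implementation: the function names (`jitter_box`, `make_denoising_pairs`), the `(cx, cy, w, h)` box format, the noise ranges, and the rule that lightly perturbed boxes count as foreground while heavily perturbed boxes count as background are all assumptions made for illustration, loosely following contrastive denoising as used in DETR-style detectors.

```python
import random


def jitter_box(box, lo, hi, rng):
    """Perturb a (cx, cy, w, h) box with noise magnitudes drawn from [lo, hi].

    Hypothetical helper: the center is shifted by a fraction of the box size
    and the width/height are rescaled, each with a random sign.
    """
    cx, cy, w, h = box

    def noise():
        # Magnitude in [lo, hi], applied in a random direction.
        return rng.choice((-1.0, 1.0)) * rng.uniform(lo, hi)

    new_cx = cx + noise() * w
    new_cy = cy + noise() * h
    # Clamp the scale factor so the box never collapses to zero size.
    new_w = w * max(0.1, 1.0 + noise())
    new_h = h * max(0.1, 1.0 + noise())
    return (new_cx, new_cy, new_w, new_h)


def make_denoising_pairs(unknown_boxes, fg_scale=0.2, bg_scale=1.0, seed=0):
    """Build one foreground and one background sample per unknown-object box.

    Assumption: small noise (below fg_scale) keeps the box on the object, so
    it is labeled foreground (1); large noise (between fg_scale and bg_scale)
    pushes it off the object, so it is labeled background (0).
    """
    rng = random.Random(seed)
    pairs = []
    for box in unknown_boxes:
        pairs.append({"box": jitter_box(box, 0.0, fg_scale, rng),
                      "target": box, "label": 1})
        pairs.append({"box": jitter_box(box, fg_scale, bg_scale, rng),
                      "target": box, "label": 0})
    return pairs


if __name__ == "__main__":
    # One unknown-object proposal in normalized (cx, cy, w, h) coordinates.
    for pair in make_denoising_pairs([(0.5, 0.5, 0.2, 0.4)]):
        print(pair["label"], [round(v, 3) for v in pair["box"]])
```

In a full training setup, each noisy box would presumably be paired with a text query embedding and the detector trained to separate the two labels; that part depends on the model and is left out of this sketch.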
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| LVIS v1.0 | OV-DQUO(ViT-L/14) | AP novel-LVIS base training | 39.3 | — | Unverified |
| LVIS v1.0 | OV-DQUO(ViT-B/16) | AP novel-LVIS base training | 29.7 | — | Unverified |
| MSCOCO | OV-DQUO(RN50x4) | AP 0.5 | 45.6 | — | Unverified |
| MSCOCO | OV-DQUO(R50) | AP 0.5 | 39.2 | — | Unverified |