Open Vocabulary Object Detection

Open-vocabulary detection (OVD) aims to generalize beyond the limited set of base classes labeled during training. The goal is to detect novel classes, drawn from an unbounded (open) vocabulary, at inference time.
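As a rough illustration of how a detector can handle class names it was never trained on, the sketch below scores region embeddings against text embeddings of an arbitrary class list using cosine similarity. This is a generic recipe, not the method of any specific paper listed here. The function name `classify_regions`, the `temperature` value, and the hand-made 3-dimensional vectors are all assumptions made for illustration; a real system would obtain both sets of embeddings from a vision-language model.

```python
import numpy as np

def classify_regions(region_emb, text_emb, class_names, temperature=0.01):
    """Label each region with the class whose text embedding is most similar.

    Hypothetical helper: region_emb is (num_regions, dim), text_emb is
    (num_classes, dim). The class list can be changed freely at inference.
    """
    # L2-normalize so the dot product equals cosine similarity.
    r = region_emb / np.linalg.norm(region_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = (r @ t.T) / temperature
    # Numerically stable softmax over the vocabulary.
    logits = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    best = probs.argmax(axis=1)
    return [class_names[i] for i in best], probs

# Toy vocabulary: "unicycle" stands in for a novel class unseen in training.
vocab = ["cat", "dog", "unicycle"]
# Toy embeddings (assumed values, not produced by a real model).
text_emb = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])
region_emb = np.array([[0.9, 0.1, 0.0],
                       [0.1, 0.0, 0.8]])

labels, probs = classify_regions(region_emb, text_emb, vocab)
print(labels)
```

Because the class list is only an input to the scoring step, extending the vocabulary amounts to adding a name and its text embedding; no classifier weights are retrained in this sketch.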

Papers

Showing 76–100 of 145 papers

Title	Date	Tasks	Status	Hype	Score
Comprehensive Multi-Modal Prototypes are Simple and Effective Classifiers for Vast-Vocabulary Object Detection	Dec 23, 2024	object-detection, Object Detection	Code Available	1	5
Localized Vision-Language Matching for Open-vocabulary Object Detection	May 12, 2022	Language Modeling, Language Modelling	Code Available	1	5
LP-OVOD: Open-Vocabulary Object Detection by Linear Probing	Oct 26, 2023	Object, object-detection	Code Available	1	5
Aligning Bag of Regions for Open-Vocabulary Object Detection	Feb 27, 2023	Object, object-detection	Code Available	1	5
MarvelOVD: Marrying Object Recognition and Vision-Language Models for Robust Open-Vocabulary Object Detection	Jul 31, 2024	Language Modelling, Object	Code Available	1	5
Meta-Adapter: An Online Few-shot Learner for Vision-Language Model	Nov 7, 2023	Few-Shot Learning, image-classification	Code Available	1	5
A Lightweight Modular Framework for Low-Cost Open-Vocabulary Object Detection Training	Aug 20, 2024	Autonomous Vehicles, Computational Efficiency	Code Available	0	5
F-VLM: Open-Vocabulary Object Detection upon Frozen Vision and Language Models	Sep 30, 2022	Knowledge Distillation, object-detection	Code Available	0	5
Cyclic Contrastive Knowledge Transfer for Open-Vocabulary Object Detection	Mar 14, 2025	object-detection, Object Detection	Code Available	0	5
LED: LLM Enhanced Open-Vocabulary Object Detection without Human Curated Data Generation	Mar 18, 2025	Decoder, Object	Code Available	0	5
Scaling Open-Vocabulary Object Detection	Jun 16, 2023	image-classification, Image Classification	Code Available	0	5
Generating Enhanced Negatives for Training Language-Based Object Detectors	Dec 29, 2023	Object, object-detection	Code Available	0	5
Region-centric Image-Language Pretraining for Open-Vocabulary Detection	Sep 29, 2023	Contrastive Learning, Object	Code Available	0	5
MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks	Mar 29, 2023	Cross-Modal Retrieval, Decoder	Code Available	0	5
Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from Text to Image via CLIP Inversion	Jul 15, 2024	image-classification, Image Classification	Code Available	0	5
Open-vocabulary vs. Closed-set: Best Practice for Few-shot Object Detection Considering Text Describability	Oct 20, 2024	Few-Shot Object Detection, image-classification	Code Available	0	5
Simple Open-Vocabulary Object Detection with Vision Transformers	May 12, 2022	Described Object Detection, image-classification	Code Available	0	5
Open-Vocabulary Object Detection via Scene Graph Discovery	Jul 7, 2023	Decoder, Graph Generation	Unverified	0	0
An Application-Agnostic Automatic Target Recognition System Using Vision Language Models	Nov 5, 2024	object-detection, Object Detection	Unverified	0	0
An Iterative Feedback Mechanism for Improving Natural Language Class Descriptions in Open-Vocabulary Object Detection	Mar 21, 2025	object-detection, Object Detection	Unverified	0	0
ATAS: Any-to-Any Self-Distillation for Enhanced Open-Vocabulary Dense Prediction	Jun 10, 2025	object-detection, Object Detection	Unverified	0	0
BACON: Improving Clarity of Image Captions via Bag-of-Concept Graphs	Jul 3, 2024	Image Captioning, Image Generation	Unverified	0	0
Boosting Open-Vocabulary Object Detection by Handling Background Samples	Oct 11, 2024	object-detection, Object Detection	Unverified	0	0
Contrastive Feature Masking Open-Vocabulary Vision Transformer	Sep 2, 2023	Contrastive Learning, Image-text Retrieval	Unverified	0	0
DenseVLM: A Retrieval and Decoupled Alignment Framework for Open-Vocabulary Dense Prediction	Dec 9, 2024	Image Segmentation, object-detection	Unverified	0	0

Datasets: MSCOCO, LVIS v1.0, Objects365, OpenImages-v4

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Cooperative Foundational Models	AP 0.5	50.3	—	Unverified
2	DE-ViT	AP 0.5	50	—	Unverified
3	Yolov8-nano	AP 0.5	47.2	—	Unverified
4	DITO	AP 0.5	46.1	—	Unverified
5	OV-DQUO(RN50x4)	AP 0.5	45.6	—	Unverified
6	LP-OVOD (OWL-ViT Proposals)	AP 0.5	44.9	—	Unverified
7	CLIPSelf	AP 0.5	44.3	—	Unverified
8	CORA+	AP 0.5	43.1	—	Unverified
9	BARON	AP 0.5	42.7	—	Unverified
10	SIA-OVD (RN50x4)	AP 0.5	41.9	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LaMI-DETR	AP novel-LVIS base training	43.4	—	Unverified
2	DITO	AP novel-LVIS base training	40.4	—	Unverified
3	OV-DQUO(ViT-L/14)	AP novel-LVIS base training	39.3	—	Unverified
4	CoDet (EVA02-L)	AP novel-LVIS base training	37	—	Unverified
5	CLIPSelf	AP novel-LVIS base training	34.9	—	Unverified
6	OVMR	AP novel-LVIS base training	34.4	—	Unverified
7	DE-ViT	AP novel-LVIS base training	34.3	—	Unverified
8	CFM-ViT	AP novel-LVIS base training	33.9	—	Unverified
9	CLIM (RN50x64)	AP novel-LVIS base training	32.3	—	Unverified
10	RO-ViT	AP novel-LVIS base training	32.1	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Object-Centric-OVD	mask AP50	22.3	—	Unverified
2	ViLD	mask AP50	18.2	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Object-Centric-OVD	mask AP50	42.9	—	Unverified
2	Detic	mask AP50	42.2	—	Unverified