Open Vocabulary Object Detection

Open-vocabulary object detection (OVD) aims to generalize beyond the limited set of base classes labeled during training. The goal is to detect novel classes, specified by an unbounded (open) vocabulary, at inference time.
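
To make the idea concrete, here is a minimal sketch of how an open vocabulary can be applied at inference. It assumes each detected region and each class name has already been mapped to a vector in a shared embedding space (for example by a vision-language model). The random vectors, the `classify_regions` helper, and the class names below are hypothetical stand-ins for illustration, not any particular paper's method.

```python
import math
import random


def normalize(vec):
    """Scale a vector to unit length (guarding against a zero norm)."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def classify_regions(region_embs, text_embs, class_names):
    """Label each region with the class name whose text embedding is most similar.

    The vocabulary is just a list of names with matching embeddings, so it can
    be swapped at inference time without retraining the detector.
    """
    results = []
    for region in region_embs:
        scores = [cosine(region, text) for text in text_embs]
        best = max(range(len(scores)), key=scores.__getitem__)
        results.append((class_names[best], scores[best]))
    return results


# Hypothetical data: random vectors stand in for real text/region embeddings.
random.seed(0)
DIM = 32
vocabulary = ["cat", "dog", "umbrella", "skateboard"]
text_embs = [[random.gauss(0, 1) for _ in range(DIM)] for _ in vocabulary]


def noisy_copy(vec, scale=0.05):
    """Simulate a region embedding that lies close to a class's text embedding."""
    return [v + random.gauss(0, scale) for v in vec]


region_embs = [noisy_copy(text_embs[3]), noisy_copy(text_embs[0])]
predictions = classify_regions(region_embs, text_embs, vocabulary)
labels = [name for name, _ in predictions]
print(labels)
```

Adding a novel class in this sketch only requires appending its name and text embedding to the vocabulary; the region embeddings are left unchanged.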

Papers


Showing 1–50 of 145 papers

Title	Date	Tasks	Status	Hype
ATAS: Any-to-Any Self-Distillation for Enhanced Open-Vocabulary Dense Prediction	Jun 10, 2025	Object Detection	Unverified	0
Gen-n-Val: Agentic Image Data Generation and Validation	Jun 5, 2025	Image Harmonization, Instance Segmentation	Unverified	0
From Data to Modeling: Fully Open-vocabulary Scene Graph Generation	May 26, 2025	Graph Generation, Knowledge Distillation	Unverified	0
FG-CLIP: Fine-Grained Visual and Textual Alignment	May 8, 2025	Image-text Retrieval, Object Detection	Code Available	4
VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model	Apr 10, 2025	Language Modeling	Code Available	9
Superpowering Open-Vocabulary Object Detectors for X-ray Vision	Mar 21, 2025	Object Detection	Code Available	1
An Iterative Feedback Mechanism for Improving Natural Language Class Descriptions in Open-Vocabulary Object Detection	Mar 21, 2025	Object Detection	Unverified	0
Fine-Grained Open-Vocabulary Object Detection with Fined-Grained Prompts: Task, Dataset and Benchmark	Mar 19, 2025	Object, Object Detection	Unverified	0
LED: LLM Enhanced Open-Vocabulary Object Detection without Human Curated Data Generation	Mar 18, 2025	Decoder, Object	Code Available	0
Cyclic Contrastive Knowledge Transfer for Open-Vocabulary Object Detection	Mar 14, 2025	Object Detection	Code Available	0
A Hierarchical Semantic Distillation Framework for Open-Vocabulary Object Detection	Mar 13, 2025	Object Detection	Code Available	1
DitHub: A Modular Framework for Incremental Open-Vocabulary Object Detection	Mar 12, 2025	Object Detection	Unverified	0
Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement	Mar 9, 2025	Domain Generalization, Object Detection	Code Available	4
OpenRSD: Towards Open-prompts for Object Detection in Remote Sensing Images	Mar 8, 2025	Object, Object Detection	Unverified	0
Visual-RFT: Visual Reinforcement Fine-Tuning	Mar 3, 2025	Few-Shot Object Detection, Fine-Grained Image Classification	Code Available	7
MQADet: A Plug-and-Play Paradigm for Enhancing Open-Vocabulary Object Detection via Multimodal Question Answering	Feb 23, 2025	Object, Object Detection	Unverified	0
Learning the RoPEs: Better 2D and 3D Position Encodings with STRING	Feb 4, 2025	Object Detection	Unverified	0
Modulating CNN Features with Pre-Trained ViT Representations for Open-Vocabulary Object Detection	Jan 28, 2025	Object Detection	Unverified	0
OW-OVD: Unified Open World and Open Vocabulary Object Detection	Jan 1, 2025	Attribute, Incremental Learning	Code Available	1
Open-World Objectness Modeling Unifies Novel Object Detection	Jan 1, 2025	Novel Object Detection, Object Detection	Unverified	0
Sampling Bag of Views for Open-Vocabulary Object Detection	Dec 24, 2024	Object Detection	Unverified	0
Comprehensive Multi-Modal Prototypes are Simple and Effective Classifiers for Vast-Vocabulary Object Detection	Dec 23, 2024	Object Detection	Code Available	1
DenseVLM: A Retrieval and Decoupled Alignment Framework for Open-Vocabulary Dense Prediction	Dec 9, 2024	Image Segmentation, Object Detection	Unverified	0
From Open Vocabulary to Open World: Teaching Vision Language Models to Detect Novel Objects	Nov 27, 2024	Autonomous Driving, Object	Code Available	1
Open Vocabulary Monocular 3D Object Detection	Nov 25, 2024	3D Object Detection, Monocular 3D Object Detection	Code Available	2
Fine-Grained Open-Vocabulary Object Recognition via User-Guided Segmentation	Nov 23, 2024	Object, Object Detection	Unverified	0
An Application-Agnostic Automatic Target Recognition System Using Vision Language Models	Nov 5, 2024	Object Detection	Unverified	0
Open-Vocabulary Object Detection via Language Hierarchy	Oct 27, 2024	Object, Object Detection	Unverified	0
OVT-B: A New Large-Scale Benchmark for Open-Vocabulary Multi-Object Tracking	Oct 23, 2024	Multi-Object Tracking, Object	Code Available	1
Few-shot target-driven instance detection based on open-vocabulary object detection models	Oct 21, 2024	Image Augmentation, Object	Unverified	0
Open-vocabulary vs. Closed-set: Best Practice for Few-shot Object Detection Considering Text Describability	Oct 20, 2024	Few-Shot Object Detection, Image Classification	Code Available	0
LUDVIG: Learning-free Uplifting of 2D Visual features to Gaussian Splatting scenes	Oct 18, 2024	3D Geometry, Object Detection	Unverified	0
Boosting Open-Vocabulary Object Detection by Handling Background Samples	Oct 11, 2024	Object Detection	Unverified	0
VOVTrack: Exploring the Potentiality in Videos for Open-Vocabulary Object Tracking	Oct 11, 2024	Multi-Object Tracking, Object	Unverified	0
SIA-OVD: Shape-Invariant Adapter for Bridging the Image-Region Gap in Open-Vocabulary Detection	Oct 8, 2024	Object Detection	Code Available	1
Search and Detect: Training-Free Long Tail Object Detection via Web-Image Retrieval	Sep 26, 2024	Image Retrieval, Object	Unverified	0
HA-FGOVD: Highlighting Fine-grained Attributes via Explicit Linear Composition for Open-Vocabulary Object Detection	Sep 24, 2024	Attribute, Object Detection	Unverified	0
End-to-end Open-vocabulary Video Visual Relationship Detection using Multi-modal Prompting	Sep 19, 2024	Decoder, Object	Unverified	0
Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection	Sep 13, 2024	Mamba, Open Vocabulary Object Detection	Code Available	2
A Lightweight Modular Framework for Low-Cost Open-Vocabulary Object Detection Training	Aug 20, 2024	Autonomous Vehicles, Computational Efficiency	Code Available	0
On the Potential of Open-Vocabulary Models for Object Detection in Unusual Street Scenes	Aug 20, 2024	Object, Object Detection	Unverified	0
Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community	Aug 17, 2024	Novel Concepts, Object	Code Available	3
Query3D: LLM-Powered Open-Vocabulary Scene Segmentation with Language Embedded 3D Gaussian	Aug 7, 2024	Autonomous Driving, Object Detection	Code Available	1
MarvelOVD: Marrying Object Recognition and Vision-Language Models for Robust Open-Vocabulary Object Detection	Jul 31, 2024	Language Modelling, Object	Code Available	1
LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction	Jul 16, 2024	Language Modeling	Code Available	2
Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from Text to Image via CLIP Inversion	Jul 15, 2024	Image Classification	Code Available	0
OVLW-DETR: Open-Vocabulary Light-Weighted Detection Transformer	Jul 15, 2024	Language Modeling	Code Available	3
DART: An Automated End-to-End Object Detection Pipeline with Data Diversification, Open-Vocabulary Bounding Box Annotation, Pseudo-Label Review, and Model Training	Jul 12, 2024	Image Generation, Object	Code Available	1
BACON: Improving Clarity of Image Captions via Bag-of-Concept Graphs	Jul 3, 2024	Image Captioning, Image Generation	Unverified	0
V3Det Challenge 2024 on Vast Vocabulary and Open Vocabulary Object Detection: Methods and Results	Jun 17, 2024	Object, Object Detection	Unverified	0


Datasets: MSCOCO, LVIS v1.0, Objects365, OpenImages-v4

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Cooperative Foundational Models	AP 0.5	50.3	—	Unverified
2	DE-ViT	AP 0.5	50	—	Unverified
3	Yolov8-nano	AP 0.5	47.2	—	Unverified
4	DITO	AP 0.5	46.1	—	Unverified
5	OV-DQUO(RN50x4)	AP 0.5	45.6	—	Unverified
6	LP-OVOD (OWL-ViT Proposals)	AP 0.5	44.9	—	Unverified
7	CLIPSelf	AP 0.5	44.3	—	Unverified
8	CORA+	AP 0.5	43.1	—	Unverified
9	BARON	AP 0.5	42.7	—	Unverified
10	SIA-OVD (RN50x4)	AP 0.5	41.9	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LaMI-DETR	AP novel-LVIS base training	43.4	—	Unverified
2	DITO	AP novel-LVIS base training	40.4	—	Unverified
3	OV-DQUO(ViT-L/14)	AP novel-LVIS base training	39.3	—	Unverified
4	CoDet (EVA02-L)	AP novel-LVIS base training	37	—	Unverified
5	CLIPSelf	AP novel-LVIS base training	34.9	—	Unverified
6	OVMR	AP novel-LVIS base training	34.4	—	Unverified
7	DE-ViT	AP novel-LVIS base training	34.3	—	Unverified
8	CFM-ViT	AP novel-LVIS base training	33.9	—	Unverified
9	CLIM (RN50x64)	AP novel-LVIS base training	32.3	—	Unverified
10	RO-ViT	AP novel-LVIS base training	32.1	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Object-Centric-OVD	mask AP50	22.3	—	Unverified
2	ViLD	mask AP50	18.2	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Object-Centric-OVD	mask AP50	42.9	—	Unverified
2	Detic	mask AP50	42.2	—	Unverified