Open Vocabulary Object Detection

Open-vocabulary object detection (OVD) aims to generalize beyond the limited set of base classes that are labeled during training. The goal is to detect novel classes, drawn from an unbounded (open) vocabulary, at inference time.
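
To make "an open vocabulary at inference" concrete, here is a minimal sketch of one common approach, which is an assumption here rather than something this page specifies: each detected region is scored against embeddings of class names, so new classes can be added without retraining. The 3-d vectors and the helper names `cosine` and `classify_region` are hypothetical; a real system would take embeddings from a vision-language model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def classify_region(region_embedding, vocabulary):
    """Return (label, score) of the best-matching label in the given vocabulary."""
    scores = {label: cosine(region_embedding, emb) for label, emb in vocabulary.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Toy embeddings, made up for illustration only.
base_vocab = {"cat": [1.0, 0.0, 0.0], "dog": [0.0, 1.0, 0.0]}
region = [0.1, 0.2, 0.9]  # hypothetical embedding of one detected region

# With only base classes, the region is forced onto the closest base label.
label_base, _ = classify_region(region, base_vocab)

# Adding a novel class at inference needs only its embedding, no retraining.
open_vocab = dict(base_vocab, umbrella=[0.0, 0.1, 1.0])
label_open, score_open = classify_region(region, open_vocab)
```

In this toy setup the region moves from a base label to the novel label once the vocabulary is extended, which is the behaviour the definition above describes.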

Papers

Showing 51–100 of 145 papers

Title	Date	Tasks	Status	Hype
Open-Vocabulary X-ray Prohibited Item Detection via Fine-tuning CLIP	Jun 16, 2024	object-detection, Object Detection	Unverified	0
Enhanced Object Detection: A Study on Vast Vocabulary Object Detection Track for V3Det Challenge 2024	Jun 13, 2024	Object, object-detection	Unverified	0
OVMR: Open-Vocabulary Recognition with Multi-Modal References	Jun 7, 2024	Open Vocabulary Object Detection	Code Available	1
Learning Background Prompts to Discover Implicit Knowledge for Open Vocabulary Object Detection	Jun 1, 2024	Knowledge Distillation, Object	Unverified	0
RTGen: Generating Region-Text Pairs for Open-Vocabulary Object Detection	May 30, 2024	Image Captioning, Image Inpainting	Code Available	1
OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision	May 28, 2024	Contrastive Learning, Denoising	Code Available	1
SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection	May 16, 2024	object-detection, Object Detection	Code Available	2
Open-Vocabulary Object Detection via Neighboring Region Attention Alignment	May 14, 2024	Diversity, object-detection	Unverified	0
The devil is in the object boundary: towards annotation-free instance segmentation using Foundation Models	Apr 18, 2024	Instance Segmentation, Object	Code Available	1
Watch Your Step: Optimal Retrieval for Continual Learning at Scale	Apr 16, 2024	Continual Learning, object-detection	Unverified	0
DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection	Apr 14, 2024	Dense Captioning, Language Modelling	Unverified	0
Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation	Apr 12, 2024	Object, object-detection	Code Available	1
Retrieval-Augmented Open-Vocabulary Object Detection	Apr 8, 2024	Language Modeling, Language Modelling	Code Available	1
Is CLIP the main roadblock for fine-grained open-world perception?	Apr 4, 2024	Autonomous Driving, Novel Concepts	Code Available	2
Open-Vocabulary Object Detectors: Robustness Challenges under Distribution Shifts	Apr 1, 2024	Object, object-detection	Unverified	0
VisionGPT: LLM-Assisted Real-Time Anomaly Detection for Safe Visual Navigation	Mar 19, 2024	Anomaly Detection, object-detection	Code Available	1
Generative Region-Language Pretraining for Open-Ended Object Detection	Mar 15, 2024	Language Modeling, Language Modelling	Code Available	2
Open-Vocabulary Object Detection with Meta Prompt Representation and Instance Contrastive Optimization	Mar 14, 2024	Contrastive Learning, Knowledge Distillation	Unverified	0
Real-time Transformer-based Open-Vocabulary Detection with Efficient Fusion Head	Mar 11, 2024	Object Detection, Open-vocabulary object detection	Code Available	5
YOLOv8-AM: YOLOv8 Based on Effective Attention Mechanisms for Pediatric Wrist Fracture Detection	Feb 14, 2024	Fracture detection, medical image detection	Code Available	2
LLMs Meet VLMs: Boost Open Vocabulary Object Detection with Fine-grained Descriptors	Feb 7, 2024	image-classification, Image Classification	Unverified	0
Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector	Feb 5, 2024	Cross-Domain Few-Shot, Cross-Domain Few-Shot Object Detection	Code Available	2
YOLO-World: Real-Time Open-Vocabulary Object Detection	Jan 30, 2024	Instance Segmentation, Language Modeling	Code Available	9
LCV2: An Efficient Pretraining-Free Framework for Grounded Visual Question Answering	Jan 29, 2024	Language Modeling, Language Modelling	Unverified	0
Scene-adaptive and Region-aware Multi-modal Prompt for Open Vocabulary Object Detection	Jan 1, 2024	Knowledge Distillation, object-detection	Unverified	0
Exploring Region-Word Alignment in Built-in Detector for Open-Vocabulary Object Detection	Jan 1, 2024	Decoder, object-detection	Unverified	0
Generating Enhanced Negatives for Training Language-Based Object Detectors	Dec 29, 2023	Object, object-detection	Code Available	0
GroundVLP: Harnessing Zero-shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection	Dec 22, 2023	Attribute, object-detection	Code Available	1
Weakly Supervised Open-Vocabulary Object Detection	Dec 19, 2023	Attribute, Novel Concepts	Unverified	0
CLIM: Contrastive Language-Image Mosaic for Region Representation	Dec 18, 2023	Object, object-detection	Code Available	1
Simple Image-level Classification Improves Open-vocabulary Object Detection	Dec 16, 2023	Knowledge Distillation, Object	Code Available	1
ProxyDet: Synthesizing Proxy Novel Classes via Classwise Mixup for Open-Vocabulary Object Detection	Dec 12, 2023	object-detection, Object Detection	Code Available	1
Learning Pseudo-Labeler beyond Noun Concepts for Open-Vocabulary Object Detection	Dec 4, 2023	Image to text, object-detection	Unverified	0
The devil is in the fine-grained details: Evaluating open-vocabulary object detectors for fine-grained understanding	Nov 29, 2023	Object, object-detection	Code Available	1
Toward Open Vocabulary Aerial Object Detection with CLIP-Activated Student-Teacher Learning	Nov 20, 2023	Object, object-detection	Code Available	1
Enhancing Novel Object Detection via Cooperative Foundational Models	Nov 19, 2023	Novel Class Discovery, Novel Object Detection	Code Available	1
Expanding Scene Graph Boundaries: Fully Open-vocabulary Scene Graph Generation via Visual-Concept Alignment and Retention	Nov 18, 2023	Concept Alignment, Graph Generation	Code Available	1
Meta-Adapter: An Online Few-shot Learner for Vision-Language Model	Nov 7, 2023	Few-Shot Learning, image-classification	Code Available	1
Spuriosity Rankings for Free: A Simple Framework for Last Layer Retraining Based on Object Detection	Oct 31, 2023	Object, object-detection	Unverified	0
YOLOv8-Based Visual Detection of Road Hazards: Potholes, Sewer Covers, and Manholes	Oct 31, 2023	Computational Efficiency, object-detection	Unverified	0
LP-OVOD: Open-Vocabulary Object Detection by Linear Probing	Oct 26, 2023	Object, object-detection	Code Available	1
CoDet: Co-Occurrence Guided Region-Word Alignment for Open-Vocabulary Object Detection	Oct 25, 2023	Object, object-detection	Code Available	1
OV-VG: A Benchmark for Open-Vocabulary Visual Grounding	Oct 22, 2023	Novel Concepts, object-detection	Code Available	1
CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction	Oct 2, 2023	image-classification, Image Classification	Code Available	2
DST-Det: Simple Dynamic Self-Training for Open-Vocabulary Object Detection	Oct 2, 2023	Novel Object Detection, Object	Code Available	1
Region-centric Image-Language Pretraining for Open-Vocabulary Detection	Sep 29, 2023	Contrastive Learning, Object	Code Available	0
MoCaE: Mixture of Calibrated Experts Significantly Improves Object Detection	Sep 26, 2023	Instance Segmentation, Mixture-of-Experts	Code Available	1
Detect Everything with Few Examples	Sep 22, 2023	Binary Classification, Cross-Domain Few-Shot Object Detection	Code Available	2
EdaDet: Open-Vocabulary Object Detection Using Early Dense Alignment	Sep 3, 2023	Object, object-detection	Unverified	0
Contrastive Feature Masking Open-Vocabulary Vision Transformer	Sep 2, 2023	Contrastive Learning, Image-text Retrieval	Unverified	0

Datasets: MSCOCO, LVIS v1.0, Objects365, OpenImages-v4

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	Cooperative Foundational Models	AP 0.5	50.3	—	Unverified
2	DE-ViT	AP 0.5	50	—	Unverified
3	Yolov8-nano	AP 0.5	47.2	—	Unverified
4	DITO	AP 0.5	46.1	—	Unverified
5	OV-DQUO(RN50x4)	AP 0.5	45.6	—	Unverified
6	LP-OVOD (OWL-ViT Proposals)	AP 0.5	44.9	—	Unverified
7	CLIPSelf	AP 0.5	44.3	—	Unverified
8	CORA+	AP 0.5	43.1	—	Unverified
9	BARON	AP 0.5	42.7	—	Unverified
10	SIA-OVD (RN50x4)	AP 0.5	41.9	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	LaMI-DETR	AP novel-LVIS base training	43.4	—	Unverified
2	DITO	AP novel-LVIS base training	40.4	—	Unverified
3	OV-DQUO(ViT-L/14)	AP novel-LVIS base training	39.3	—	Unverified
4	CoDet (EVA02-L)	AP novel-LVIS base training	37	—	Unverified
5	CLIPSelf	AP novel-LVIS base training	34.9	—	Unverified
6	OVMR	AP novel-LVIS base training	34.4	—	Unverified
7	DE-ViT	AP novel-LVIS base training	34.3	—	Unverified
8	CFM-ViT	AP novel-LVIS base training	33.9	—	Unverified
9	CLIM (RN50x64)	AP novel-LVIS base training	32.3	—	Unverified
10	RO-ViT	AP novel-LVIS base training	32.1	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Object-Centric-OVD	mask AP50	22.3	—	Unverified
2	ViLD	mask AP50	18.2	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Object-Centric-OVD	mask AP50	42.9	—	Unverified
2	Detic	mask AP50	42.2	—	Unverified
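
The AP 0.5 and mask AP50 metrics in the tables above are assumed here to follow the usual convention of counting a prediction as correct when its intersection-over-union (IoU) with a ground-truth object is at least 0.5. The sketch below shows that matching test for axis-aligned boxes. The function names and the `(x1, y1, x2, y2)` box layout are illustrative assumptions, not taken from any benchmark's evaluation code, and mask AP50 would apply the same threshold to mask overlap rather than box overlap.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so non-overlapping boxes yield no intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def is_match_at_50(pred_box, gt_box, threshold=0.5):
    """True if the prediction overlaps the ground truth enough to count at IoU 0.5."""
    return box_iou(pred_box, gt_box) >= threshold

# A prediction shifted by 2 px on a 10x10 object still matches;
# one shifted by 5 px does not.
close_pred = is_match_at_50((2, 0, 12, 10), (0, 0, 10, 10))
far_pred = is_match_at_50((5, 0, 15, 10), (0, 0, 10, 10))
```

Average precision is then computed from the ranked list of such matches per class; that aggregation step is omitted from this sketch.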