SOTAVerified

Open Vocabulary Object Detection

Open-vocabulary detection (OVD) aims to generalize beyond the limited number of base classes labeled during the training phase. The goal is to detect novel classes defined by an unbounded (open) vocabulary at inference.
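
As a rough illustration of what an open vocabulary at inference can mean in practice, the sketch below scores a region embedding against text embeddings of class names by cosine similarity, so the class list can be extended without retraining. This is a toy under stated assumptions, not the method of any paper listed here: the function names `cosine` and `classify_region` are hypothetical, and the embeddings are hand-written vectors rather than outputs of a real image or text encoder.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def classify_region(region_emb, text_embs):
    """Return (best_class_name, score) for one region embedding.

    text_embs maps class name -> embedding. In a real system these would
    come from a text encoder; here they are toy vectors (assumption).
    """
    scores = {name: cosine(region_emb, emb) for name, emb in text_embs.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Toy embeddings (hypothetical values, not produced by any model).
base_vocab = {"cat": [1.0, 0.0, 0.0], "dog": [0.0, 1.0, 0.0]}
# Adding a novel class at inference only requires a new text embedding.
novel_vocab = dict(base_vocab, zebra=[0.0, 0.0, 1.0])

region = [0.1, 0.2, 0.9]  # hypothetical embedding of one detected region

label_base, score_base = classify_region(region, base_vocab)
label_open, score_open = classify_region(region, novel_vocab)
print(label_base, label_open)
```

With only the base vocabulary, the region is forced onto the closest base class. Once the novel class name is added, the same region embedding matches it instead, with no change to the scoring code.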

Papers

Showing 51–75 of 145 papers

Title | Status | Hype
PointCLIP: Point Cloud Understanding by CLIP | Code | 1
ProxyDet: Synthesizing Proxy Novel Classes via Classwise Mixup for Open-Vocabulary Object Detection | Code | 1
Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers | Code | 1
RegionCLIP: Region-based Language-Image Pretraining | Code | 1
Retrieval-Augmented Open-Vocabulary Object Detection | Code | 1
RTGen: Generating Region-Text Pairs for Open-Vocabulary Object Detection | Code | 1
CLIM: Contrastive Language-Image Mosaic for Region Representation | Code | 1
SIA-OVD: Shape-Invariant Adapter for Bridging the Image-Region Gap in Open-Vocabulary Detection | Code | 1
Simple Image-level Classification Improves Open-vocabulary Object Detection | Code | 1
Superpowering Open-Vocabulary Object Detectors for X-ray Vision | Code | 1
The devil is in the fine-grained details: Evaluating open-vocabulary object detectors for fine-grained understanding | Code | 1
The devil is in the object boundary: towards annotation-free instance segmentation using Foundation Models | Code | 1
Open Vocabulary Object Detection with Pseudo Bounding-Box Labels | Code | 1
Training-free Boost for Open-Vocabulary Object Detection with Confidence Aggregation | Code | 1
MoCaE: Mixture of Calibrated Experts Significantly Improves Object Detection | Code | 1
Enhancing Novel Object Detection via Cooperative Foundational Models | Code | 1
CORA: Adapting CLIP for Open-Vocabulary Detection with Region Prompting and Anchor Pre-Matching | Code | 1
Multi-Modal Classifiers for Open-Vocabulary Object Detection | Code | 1
Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection | Code | 1
Expanding Scene Graph Boundaries: Fully Open-vocabulary Scene Graph Generation via Visual-Concept Alignment and Retention | Code | 1
Exploiting Unlabeled Data with Vision and Language Models for Object Detection | Code | 1
Open-vocabulary Attribute Detection | Code | 1
Open-Vocabulary Object Detection Using Captions | Code | 1
DART: An Automated End-to-End Object Detection Pipeline with Data Diversification, Open-Vocabulary Bounding Box Annotation, Pseudo-Label Review, and Model Training | Code | 1
A Hierarchical Semantic Distillation Framework for Open-Vocabulary Object Detection | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Cooperative Foundational Models | AP 0.5 | 50.3 | - | Unverified
2 | DE-ViT | AP 0.5 | 50 | - | Unverified
3 | Yolov8-nano | AP 0.5 | 47.2 | - | Unverified
4 | DITO | AP 0.5 | 46.1 | - | Unverified
5 | OV-DQUO (RN50x4) | AP 0.5 | 45.6 | - | Unverified
6 | LP-OVOD (OWL-ViT Proposals) | AP 0.5 | 44.9 | - | Unverified
7 | CLIPSelf | AP 0.5 | 44.3 | - | Unverified
8 | CORA+ | AP 0.5 | 43.1 | - | Unverified
9 | BARON | AP 0.5 | 42.7 | - | Unverified
10 | SIA-OVD (RN50x4) | AP 0.5 | 41.9 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LaMI-DETR | AP novel-LVIS base training | 43.4 | - | Unverified
2 | DITO | AP novel-LVIS base training | 40.4 | - | Unverified
3 | OV-DQUO (ViT-L/14) | AP novel-LVIS base training | 39.3 | - | Unverified
4 | CoDet (EVA02-L) | AP novel-LVIS base training | 37 | - | Unverified
5 | CLIPSelf | AP novel-LVIS base training | 34.9 | - | Unverified
6 | OVMR | AP novel-LVIS base training | 34.4 | - | Unverified
7 | DE-ViT | AP novel-LVIS base training | 34.3 | - | Unverified
8 | CFM-ViT | AP novel-LVIS base training | 33.9 | - | Unverified
9 | CLIM (RN50x64) | AP novel-LVIS base training | 32.3 | - | Unverified
10 | RO-ViT | AP novel-LVIS base training | 32.1 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Object-Centric-OVD | mask AP50 | 22.3 | - | Unverified
2 | ViLD | mask AP50 | 18.2 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Object-Centric-OVD | mask AP50 | 42.9 | - | Unverified
2 | Detic | mask AP50 | 42.2 | - | Unverified