SOTAVerified

Open Vocabulary Object Detection

Open-vocabulary detection (OVD) aims to generalize beyond the limited set of base classes labeled during training. The goal is to detect novel classes, defined by an unbounded (open) vocabulary, at inference time.
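As a rough illustration of the base/novel split described above, the sketch below scores one region embedding against text embeddings of class names and picks the closest match. This is one common way OVD methods classify regions, not a method taken from this page. The function names (`cosine`, `classify_region`), the class names, and the 3-dimensional embeddings are all hypothetical and hand-written; a real system would obtain them from a vision-language model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def classify_region(region_embedding, text_embeddings):
    """Return (label, score) for the class name most similar to the region."""
    scores = {
        name: cosine(region_embedding, emb)
        for name, emb in text_embeddings.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy embeddings (assumption: hand-written for illustration only).
BASE_CLASSES = {"dog": [1.0, 0.0, 0.0], "car": [0.0, 1.0, 0.0]}   # labeled in training
NOVEL_CLASSES = {"zebra": [0.0, 0.0, 1.0]}                        # seen only as text

# At inference the vocabulary is open: base and novel names are scored together.
vocabulary = {**BASE_CLASSES, **NOVEL_CLASSES}

region = [0.1, 0.0, 0.9]  # hypothetical embedding of one detected region
label, score = classify_region(region, vocabulary)
print(label, round(score, 3))
```

Because the class list is just a dictionary of text embeddings, adding a new category at inference only requires adding one more entry; no detector weights change in this sketch.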

Papers

Showing 101-125 of 145 papers

| Title | Status | Hype |
|---|---|---|
| OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation | Code | 2 |
| Exploring Multi-Modal Contextual Knowledge for Open-Vocabulary Object Detection | | 0 |
| Taming Self-Training for Open-Vocabulary Object Detection | Code | 1 |
| Described Object Detection: Liberating Object Detection with Flexible Expressions | Code | 1 |
| Open-Vocabulary Object Detection via Scene Graph Discovery | | 0 |
| Scaling Open-Vocabulary Object Detection | Code | 0 |
| Multi-Modal Classifiers for Open-Vocabulary Object Detection | Code | 1 |
| Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers | Code | 1 |
| DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-training via Word-Region Alignment | | 0 |
| V3Det: Vast Vocabulary Visual Detection Dataset | Code | 1 |
| MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks | Code | 0 |
| Prompt-Guided Transformers for End-to-End Open-Vocabulary Object Detection | | 0 |
| CORA: Adapting CLIP for Open-Vocabulary Detection with Region Prompting and Anchor Pre-Matching | Code | 1 |
| Open-Vocabulary Object Detection using Pseudo Caption Labels | | 0 |
| Investigating the Role of Attribute Context in Vision-Language Models for Object Recognition and Detection | | 0 |
| Object-Aware Distillation Pyramid for Open-Vocabulary Object Detection | Code | 1 |
| Aligning Bag of Regions for Open-Vocabulary Object Detection | Code | 1 |
| OvarNet: Towards Open-vocabulary Object Attribute Recognition | Code | 1 |
| Open-Vocabulary Object Detection With an Open Corpus | | 0 |
| Distilling DETR with Visual-Linguistic Knowledge for Open-Vocabulary Object Detection | Code | 1 |
| Learning To Generate Language-Supervised and Open-Vocabulary Scene Graph Using Pre-Trained Visual-Semantic Space | Code | 1 |
| Learning to Detect and Segment for Open Vocabulary Object Detection | | 0 |
| X-Paste: Revisiting Scalable Copy-Paste for Instance Segmentation using CLIP and StableDiffusion | Code | 1 |
| Learning Object-Language Alignments for Open-Vocabulary Object Detection | Code | 1 |
| Open-vocabulary Attribute Detection | Code | 1 |
Page 5 of 6

Benchmark Results

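| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Cooperative Foundational Models | AP 0.5 | 50.3 | | Unverified |
| 2 | DE-ViT | AP 0.5 | 50 | | Unverified |
| 3 | Yolov8-nano | AP 0.5 | 47.2 | | Unverified |
| 4 | DITO | AP 0.5 | 46.1 | | Unverified |
| 5 | OV-DQUO(RN50x4) | AP 0.5 | 45.6 | | Unverified |
| 6 | LP-OVOD (OWL-ViT Proposals) | AP 0.5 | 44.9 | | Unverified |
| 7 | CLIPSelf | AP 0.5 | 44.3 | | Unverified |
| 8 | CORA+ | AP 0.5 | 43.1 | | Unverified |
| 9 | BARON | AP 0.5 | 42.7 | | Unverified |
| 10 | SIA-OVD (RN50x4) | AP 0.5 | 41.9 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | LaMI-DETR | AP novel-LVIS base training | 43.4 | | Unverified |
| 2 | DITO | AP novel-LVIS base training | 40.4 | | Unverified |
| 3 | OV-DQUO(ViT-L/14) | AP novel-LVIS base training | 39.3 | | Unverified |
| 4 | CoDet (EVA02-L) | AP novel-LVIS base training | 37 | | Unverified |
| 5 | CLIPSelf | AP novel-LVIS base training | 34.9 | | Unverified |
| 6 | OVMR | AP novel-LVIS base training | 34.4 | | Unverified |
| 7 | DE-ViT | AP novel-LVIS base training | 34.3 | | Unverified |
| 8 | CFM-ViT | AP novel-LVIS base training | 33.9 | | Unverified |
| 9 | CLIM (RN50x64) | AP novel-LVIS base training | 32.3 | | Unverified |
| 10 | RO-ViT | AP novel-LVIS base training | 32.1 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Object-Centric-OVD | mask AP50 | 22.3 | | Unverified |
| 2 | ViLD | mask AP50 | 18.2 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Object-Centric-OVD | mask AP50 | 42.9 | | Unverified |
| 2 | Detic | mask AP50 | 42.2 | | Unverified |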