SOTAVerified

Open Vocabulary Object Detection

Open-vocabulary detection (OVD) aims to generalize beyond the limited number of base classes labeled during the training phase. The goal is to detect novel classes defined by an unbounded (open) vocabulary at inference.
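
As a rough illustration of how the vocabulary can stay open at inference, the sketch below labels region embeddings by cosine similarity to text embeddings of class names, so a novel class is added simply by appending its name embedding. This is a common pattern in the literature, not a method described on this page; the function name `classify_regions`, the class names, and the toy 3-dimensional embeddings are all assumptions made up for the example (a real system would obtain embeddings from a vision-language model).

```python
import numpy as np

def classify_regions(region_emb, text_emb, class_names):
    """Label each region with the class whose text embedding is most similar.

    region_emb: (num_regions, dim) array of region embeddings (hypothetical).
    text_emb:   (num_classes, dim) array of class-name embeddings (hypothetical).
    """
    # L2-normalize both sets so the dot product equals cosine similarity.
    r = region_emb / np.linalg.norm(region_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    sims = r @ t.T                      # shape: (num_regions, num_classes)
    best = sims.argmax(axis=1)          # most similar class per region
    return [class_names[i] for i in best], sims

# Toy data: two base classes seen in training, one novel class added at inference.
base_classes = ["cat", "dog"]
novel_classes = ["zebra"]
vocab = base_classes + novel_classes

text_emb = np.array([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])
region_emb = np.array([[0.9, 0.1, 0.0],
                       [0.1, 0.2, 0.95]])

labels, sims = classify_regions(region_emb, text_emb, vocab)
print(labels)
```

Because the classifier is just a similarity lookup, extending the vocabulary in this sketch needs no retraining: only `vocab` and `text_emb` change.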

Papers

Showing 76-100 of 145 papers

| Title | Status | Hype |
| --- | --- | --- |
| Comprehensive Multi-Modal Prototypes are Simple and Effective Classifiers for Vast-Vocabulary Object Detection | Code | 1 |
| Localized Vision-Language Matching for Open-vocabulary Object Detection | Code | 1 |
| LP-OVOD: Open-Vocabulary Object Detection by Linear Probing | Code | 1 |
| Aligning Bag of Regions for Open-Vocabulary Object Detection | Code | 1 |
| MarvelOVD: Marrying Object Recognition and Vision-Language Models for Robust Open-Vocabulary Object Detection | Code | 1 |
| Meta-Adapter: An Online Few-shot Learner for Vision-Language Model | Code | 1 |
| A Lightweight Modular Framework for Low-Cost Open-Vocabulary Object Detection Training | Code | 0 |
| F-VLM: Open-Vocabulary Object Detection upon Frozen Vision and Language Models | Code | 0 |
| Cyclic Contrastive Knowledge Transfer for Open-Vocabulary Object Detection | Code | 0 |
| LED: LLM Enhanced Open-Vocabulary Object Detection without Human Curated Data Generation | Code | 0 |
| Scaling Open-Vocabulary Object Detection | Code | 0 |
| Generating Enhanced Negatives for Training Language-Based Object Detectors | Code | 0 |
| Region-centric Image-Language Pretraining for Open-Vocabulary Detection | Code | 0 |
| MaMMUT: A Simple Architecture for Joint Learning for MultiModal Tasks | Code | 0 |
| Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from Text to Image via CLIP Inversion | Code | 0 |
| Open-vocabulary vs. Closed-set: Best Practice for Few-shot Object Detection Considering Text Describability | Code | 0 |
| Simple Open-Vocabulary Object Detection with Vision Transformers | Code | 0 |
| Open-Vocabulary Object Detection via Scene Graph Discovery | | 0 |
| An Application-Agnostic Automatic Target Recognition System Using Vision Language Models | | 0 |
| An Iterative Feedback Mechanism for Improving Natural Language Class Descriptions in Open-Vocabulary Object Detection | | 0 |
| ATAS: Any-to-Any Self-Distillation for Enhanced Open-Vocabulary Dense Prediction | | 0 |
| BACON: Improving Clarity of Image Captions via Bag-of-Concept Graphs | | 0 |
| Boosting Open-Vocabulary Object Detection by Handling Background Samples | | 0 |
| Contrastive Feature Masking Open-Vocabulary Vision Transformer | | 0 |
| DenseVLM: A Retrieval and Decoupled Alignment Framework for Open-Vocabulary Dense Prediction | | 0 |

Benchmark Results

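| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Cooperative Foundational Models | AP 0.5 | 50.3 | | Unverified |
| 2 | DE-ViT | AP 0.5 | 50 | | Unverified |
| 3 | Yolov8-nano | AP 0.5 | 47.2 | | Unverified |
| 4 | DITO | AP 0.5 | 46.1 | | Unverified |
| 5 | OV-DQUO(RN50x4) | AP 0.5 | 45.6 | | Unverified |
| 6 | LP-OVOD (OWL-ViT Proposals) | AP 0.5 | 44.9 | | Unverified |
| 7 | CLIPSelf | AP 0.5 | 44.3 | | Unverified |
| 8 | CORA+ | AP 0.5 | 43.1 | | Unverified |
| 9 | BARON | AP 0.5 | 42.7 | | Unverified |
| 10 | SIA-OVD (RN50x4) | AP 0.5 | 41.9 | | Unverified |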

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | LaMI-DETR | AP novel-LVIS base training | 43.4 | | Unverified |
| 2 | DITO | AP novel-LVIS base training | 40.4 | | Unverified |
| 3 | OV-DQUO(ViT-L/14) | AP novel-LVIS base training | 39.3 | | Unverified |
| 4 | CoDet (EVA02-L) | AP novel-LVIS base training | 37 | | Unverified |
| 5 | CLIPSelf | AP novel-LVIS base training | 34.9 | | Unverified |
| 6 | OVMR | AP novel-LVIS base training | 34.4 | | Unverified |
| 7 | DE-ViT | AP novel-LVIS base training | 34.3 | | Unverified |
| 8 | CFM-ViT | AP novel-LVIS base training | 33.9 | | Unverified |
| 9 | CLIM (RN50x64) | AP novel-LVIS base training | 32.3 | | Unverified |
| 10 | RO-ViT | AP novel-LVIS base training | 32.1 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Object-Centric-OVD | mask AP50 | 22.3 | | Unverified |
| 2 | ViLD | mask AP50 | 18.2 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | Object-Centric-OVD | mask AP50 | 42.9 | | Unverified |
| 2 | Detic | mask AP50 | 42.2 | | Unverified |