SOTAVerified

Open Vocabulary Object Detection

Open-vocabulary detection (OVD) aims to generalize beyond the limited number of base classes labeled during the training phase. The goal is to detect novel classes defined by an unbounded (open) vocabulary at inference.
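
As a rough illustration of what "defined by an open vocabulary at inference" can mean in practice, the sketch below scores one region embedding against class-name embeddings by cosine similarity, then extends the vocabulary without retraining. This is only a minimal sketch of one common matching approach, not the method of any paper listed here. The 3-dimensional vectors, the class names, and the helper `classify_region` are hypothetical. A real system would take these embeddings from a vision-language model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify_region(region_embedding, vocabulary):
    """Return (class_name, score) for the best-matching class.

    `vocabulary` maps class names to text embeddings. Hypothetical helper:
    it stands in for the classification head of a detector.
    """
    scores = {name: cosine(region_embedding, emb) for name, emb in vocabulary.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy embeddings invented for illustration (assumption, not real model output).
base_vocab = {"cat": [1.0, 0.0, 0.0], "dog": [0.0, 1.0, 0.0]}
region = [0.1, 0.2, 0.9]  # embedding of one detected region

# With only base classes, the region is forced onto the nearest base class.
label_base, score_base = classify_region(region, base_vocab)

# At inference the vocabulary is extended with a novel class; no weights change.
open_vocab = dict(base_vocab, zebra=[0.0, 0.1, 1.0])
label_open, score_open = classify_region(region, open_vocab)

print(label_base, label_open)
```

The point of the design is that the class list is an input at inference time rather than a fixed output layer, so adding a novel class only requires a new text embedding.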

Papers

Showing 1-50 of 145 papers

Title | Status | Hype
ATAS: Any-to-Any Self-Distillation for Enhanced Open-Vocabulary Dense Prediction | - | 0
Gen-n-Val: Agentic Image Data Generation and Validation | - | 0
From Data to Modeling: Fully Open-vocabulary Scene Graph Generation | - | 0
FG-CLIP: Fine-Grained Visual and Textual Alignment | Code | 4
VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model | Code | 9
Superpowering Open-Vocabulary Object Detectors for X-ray Vision | Code | 1
An Iterative Feedback Mechanism for Improving Natural Language Class Descriptions in Open-Vocabulary Object Detection | - | 0
Fine-Grained Open-Vocabulary Object Detection with Fined-Grained Prompts: Task, Dataset and Benchmark | - | 0
LED: LLM Enhanced Open-Vocabulary Object Detection without Human Curated Data Generation | Code | 0
Cyclic Contrastive Knowledge Transfer for Open-Vocabulary Object Detection | Code | 0
A Hierarchical Semantic Distillation Framework for Open-Vocabulary Object Detection | Code | 1
DitHub: A Modular Framework for Incremental Open-Vocabulary Object Detection | - | 0
Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement | Code | 4
OpenRSD: Towards Open-prompts for Object Detection in Remote Sensing Images | - | 0
Visual-RFT: Visual Reinforcement Fine-Tuning | Code | 7
MQADet: A Plug-and-Play Paradigm for Enhancing Open-Vocabulary Object Detection via Multimodal Question Answering | - | 0
Learning the RoPEs: Better 2D and 3D Position Encodings with STRING | - | 0
Modulating CNN Features with Pre-Trained ViT Representations for Open-Vocabulary Object Detection | - | 0
OW-OVD: Unified Open World and Open Vocabulary Object Detection | Code | 1
Open-World Objectness Modeling Unifies Novel Object Detection | - | 0
Sampling Bag of Views for Open-Vocabulary Object Detection | - | 0
Comprehensive Multi-Modal Prototypes are Simple and Effective Classifiers for Vast-Vocabulary Object Detection | Code | 1
DenseVLM: A Retrieval and Decoupled Alignment Framework for Open-Vocabulary Dense Prediction | - | 0
From Open Vocabulary to Open World: Teaching Vision Language Models to Detect Novel Objects | Code | 1
Open Vocabulary Monocular 3D Object Detection | Code | 2
Fine-Grained Open-Vocabulary Object Recognition via User-Guided Segmentation | - | 0
An Application-Agnostic Automatic Target Recognition System Using Vision Language Models | - | 0
Open-Vocabulary Object Detection via Language Hierarchy | - | 0
OVT-B: A New Large-Scale Benchmark for Open-Vocabulary Multi-Object Tracking | Code | 1
Few-shot target-driven instance detection based on open-vocabulary object detection models | - | 0
Open-vocabulary vs. Closed-set: Best Practice for Few-shot Object Detection Considering Text Describability | Code | 0
LUDVIG: Learning-free Uplifting of 2D Visual features to Gaussian Splatting scenes | - | 0
Boosting Open-Vocabulary Object Detection by Handling Background Samples | - | 0
VOVTrack: Exploring the Potentiality in Videos for Open-Vocabulary Object Tracking | - | 0
SIA-OVD: Shape-Invariant Adapter for Bridging the Image-Region Gap in Open-Vocabulary Detection | Code | 1
Search and Detect: Training-Free Long Tail Object Detection via Web-Image Retrieval | - | 0
HA-FGOVD: Highlighting Fine-grained Attributes via Explicit Linear Composition for Open-Vocabulary Object Detection | - | 0
End-to-end Open-vocabulary Video Visual Relationship Detection using Multi-modal Prompting | - | 0
Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection | Code | 2
A Lightweight Modular Framework for Low-Cost Open-Vocabulary Object Detection Training | Code | 0
On the Potential of Open-Vocabulary Models for Object Detection in Unusual Street Scenes | - | 0
Locate Anything on Earth: Advancing Open-Vocabulary Object Detection for Remote Sensing Community | Code | 3
Query3D: LLM-Powered Open-Vocabulary Scene Segmentation with Language Embedded 3D Gaussian | Code | 1
MarvelOVD: Marrying Object Recognition and Vision-Language Models for Robust Open-Vocabulary Object Detection | Code | 1
LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction | Code | 2
Unconstrained Open Vocabulary Image Classification: Zero-Shot Transfer from Text to Image via CLIP Inversion | Code | 0
OVLW-DETR: Open-Vocabulary Light-Weighted Detection Transformer | Code | 3
DART: An Automated End-to-End Object Detection Pipeline with Data Diversification, Open-Vocabulary Bounding Box Annotation, Pseudo-Label Review, and Model Training | Code | 1
BACON: Improving Clarity of Image Captions via Bag-of-Concept Graphs | - | 0
V3Det Challenge 2024 on Vast Vocabulary and Open Vocabulary Object Detection: Methods and Results | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | Cooperative Foundational Models | AP 0.5 | 50.3 | - | Unverified
2 | DE-ViT | AP 0.5 | 50 | - | Unverified
3 | Yolov8-nano | AP 0.5 | 47.2 | - | Unverified
4 | DITO | AP 0.5 | 46.1 | - | Unverified
5 | OV-DQUO(RN50x4) | AP 0.5 | 45.6 | - | Unverified
6 | LP-OVOD (OWL-ViT Proposals) | AP 0.5 | 44.9 | - | Unverified
7 | CLIPSelf | AP 0.5 | 44.3 | - | Unverified
8 | CORA+ | AP 0.5 | 43.1 | - | Unverified
9 | BARON | AP 0.5 | 42.7 | - | Unverified
10 | SIA-OVD (RN50x4) | AP 0.5 | 41.9 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | LaMI-DETR | AP novel-LVIS base training | 43.4 | - | Unverified
2 | DITO | AP novel-LVIS base training | 40.4 | - | Unverified
3 | OV-DQUO(ViT-L/14) | AP novel-LVIS base training | 39.3 | - | Unverified
4 | CoDet (EVA02-L) | AP novel-LVIS base training | 37 | - | Unverified
5 | CLIPSelf | AP novel-LVIS base training | 34.9 | - | Unverified
6 | OVMR | AP novel-LVIS base training | 34.4 | - | Unverified
7 | DE-ViT | AP novel-LVIS base training | 34.3 | - | Unverified
8 | CFM-ViT | AP novel-LVIS base training | 33.9 | - | Unverified
9 | CLIM (RN50x64) | AP novel-LVIS base training | 32.3 | - | Unverified
10 | RO-ViT | AP novel-LVIS base training | 32.1 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Object-Centric-OVD | mask AP50 | 22.3 | - | Unverified
2 | ViLD | mask AP50 | 18.2 | - | Unverified

# | Model | Metric | Claimed | Verified | Status
1 | Object-Centric-OVD | mask AP50 | 42.9 | - | Unverified
2 | Detic | mask AP50 | 42.2 | - | Unverified