SOTAVerified

Described Object Detection

Described Object Detection (DOD) aims to detect every instance that matches a flexible language description, in each image of a dataset. It is a superset of Open-Vocabulary Object Detection (OVD) and Referring Expression Comprehension (REC): it extends OVD's category names to flexible language expressions, and it overcomes REC's limitation of grounding only objects that are assumed to already exist in the image. Work related to DOD is tracked in the awesome-DOD list on GitHub.
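The difference in output contract between REC and DOD can be sketched in a few lines of Python. This is a toy illustration only, not the method of any paper listed here: `ScoredBox`, `rec_style`, `dod_style`, the match scores, and the 0.5 threshold are all hypothetical, and a real system would obtain the scores from a vision-language model.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ScoredBox:
    # Box corners in pixels plus a hypothetical description-match score in [0, 1].
    x1: float
    y1: float
    x2: float
    y2: float
    score: float


def rec_style(candidates: List[ScoredBox]) -> List[ScoredBox]:
    """REC-style contract: return the single best box, assuming the target exists."""
    if not candidates:
        return []
    return [max(candidates, key=lambda b: b.score)]


def dod_style(candidates: List[ScoredBox], threshold: float = 0.5) -> List[ScoredBox]:
    """DOD-style contract: return every box that matches, which may be none."""
    return [b for b in candidates if b.score >= threshold]


# Hypothetical image with two objects matching the description and one distractor.
two_matches = [
    ScoredBox(0, 0, 10, 10, 0.9),
    ScoredBox(20, 20, 30, 30, 0.8),
    ScoredBox(5, 5, 8, 8, 0.1),
]
# Hypothetical image in which nothing matches the description.
no_match = [ScoredBox(0, 0, 10, 10, 0.2)]

rec_two = rec_style(two_matches)   # 1 box: the second match is missed
dod_two = dod_style(two_matches)   # 2 boxes
rec_none = rec_style(no_match)     # 1 box: a forced false positive
dod_none = dod_style(no_match)     # 0 boxes
```

Under these assumptions, the REC-style function always returns one box even when the image contains several matches or none, while the DOD-style function returns as many boxes as match the description.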

Papers

Showing 1-8 of 8 papers

Title | Status | Hype
An Open and Comprehensive Pipeline for Unified Object Grounding and Detection | Code | 1
SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models | Code | 4
Described Object Detection: Liberating Object Detection with Flexible Expressions | Code | 1
CORA: Adapting CLIP for Open-Vocabulary Detection with Region Prompting and Anchor Pre-Matching | Code | 1
Universal Instance Perception as Object Discovery and Retrieval | Code | 3
Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone | Code | 1
Simple Open-Vocabulary Object Detection with Vision Transformers | Code | 0
Grounded Language-Image Pre-training | Code | 2

No leaderboard results yet.