SOTAVerified

Open-Vocabulary Attribute Detection

Open-Vocabulary Attribute Detection (OVAD) is the task of detecting an open set of objects in an image and recognizing the attributes associated with each one. The objects and attributes are specified as text queries at inference time, and the model has no prior knowledge during training of the classes it is tested on.
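To make the idea of text queries supplied at inference time concrete, here is a minimal sketch of one common way such queries can be scored: compare an embedding of a detected image region with an embedding of each query and rank by cosine similarity. This is an illustration under stated assumptions, not the method of any paper listed on this page. The function names (`cosine`, `score_queries`) and the three-dimensional toy vectors are hypothetical; in practice the embeddings would come from a pretrained vision-language model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def score_queries(region_embedding, query_embeddings):
    """Return (query, similarity) pairs, best match first.

    The query set is an ordinary dict passed in at call time, so new
    attribute or object names can be added without retraining anything.
    """
    scores = [
        (query, cosine(region_embedding, embedding))
        for query, embedding in query_embeddings.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings; real ones would be produced by a
# vision-language model's image and text encoders.
region = [0.9, 0.1, 0.4]
attribute_queries = {
    "red": [1.0, 0.0, 0.3],
    "wooden": [0.0, 1.0, 0.2],
    "striped": [0.2, 0.3, 1.0],
}

ranked = score_queries(region, attribute_queries)
for query, similarity in ranked:
    print(f"{query}: {similarity:.3f}")
```

Because the queries are plain text supplied by the caller, the same scoring function works for attribute names that were never seen as training labels, which is the sense in which the vocabulary is open.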

Papers

Showing 1-14 of 14 papers

Title | Status | Hype
Compositional Caching for Training-free Open-vocabulary Attribute Detection | - | 0
Open-ended VQA benchmarking of Vision-Language models by exploiting Classification datasets and their semantic hierarchy | Code | 1
LOWA: Localize Objects in the Wild with Attributes | - | 0
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | Code | 4
OvarNet: Towards Open-vocabulary Object Attribute Recognition | Code | 1
Reproducible scaling laws for contrastive language-image learning | Code | 1
Open-vocabulary Attribute Detection | Code | 1
Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection | Code | 2
Localized Vision-Language Matching for Open-vocabulary Object Detection | Code | 1
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation | Code | 5
Multi-Grained Vision Language Pre-Training: Aligning Texts with Visual Concepts | Code | 1
Align before Fuse: Vision and Language Representation Learning with Momentum Distillation | Code | 1
Learning Transferable Visual Models From Natural Language Supervision | Code | 2
Open-Vocabulary Object Detection Using Captions | Code | 1

No leaderboard results yet.