SOTAVerified

Open Vocabulary Attribute Detection

Open-Vocabulary Attribute Detection (OVAD) is a task that aims to detect and recognize an open set of objects and their associated attributes in an image. The objects and attributes are specified by text queries at inference time, and the model has no prior knowledge of the tested classes during training.
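To make the idea of text queries supplied at inference concrete, here is a minimal sketch that ranks attribute queries for one detected region by cosine similarity between embeddings. It is an illustration only, not the method of any paper listed on this page. The function names (`cosine`, `score_queries`) and the 3-dimensional vectors are hypothetical stand-ins for the outputs of real image and text encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def score_queries(region_embedding, query_embeddings):
    """Rank text queries by similarity to one region, highest first."""
    scores = {q: cosine(region_embedding, e) for q, e in query_embeddings.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy embeddings (assumed values); a real system would obtain these
# from an image encoder and a text encoder that share one space.
region = [0.9, 0.1, 0.3]
attribute_queries = {
    "red": [1.0, 0.0, 0.2],
    "wooden": [0.0, 1.0, 0.1],
    "striped": [0.2, 0.3, 1.0],
}

ranking = score_queries(region, attribute_queries)
for query, score in ranking:
    print(f"{query}: {score:.3f}")
```

Because the queries are plain strings supplied at call time, the attribute vocabulary can change without retraining, which is the property the task description refers to. Attribute prediction is usually multi-label, so a practical system would likely apply a threshold per query rather than take only the top-ranked one.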

Papers

Showing 1-14 of 14 papers

Title | Status | Hype
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation | Code | 5
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | Code | 4
Learning Transferable Visual Models From Natural Language Supervision | Code | 2
Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection | Code | 2
Align before Fuse: Vision and Language Representation Learning with Momentum Distillation | Code | 1
OvarNet: Towards Open-vocabulary Object Attribute Recognition | Code | 1
Reproducible scaling laws for contrastive language-image learning | Code | 1
Localized Vision-Language Matching for Open-vocabulary Object Detection | Code | 1
Multi-Grained Vision Language Pre-Training: Aligning Texts with Visual Concepts | Code | 1
Open-ended VQA benchmarking of Vision-Language models by exploiting Classification datasets and their semantic hierarchy | Code | 1
Open-vocabulary Attribute Detection | Code | 1
Open-Vocabulary Object Detection Using Captions | Code | 1
Compositional Caching for Training-free Open-vocabulary Attribute Detection | - | 0
LOWA: Localize Objects in the Wild with Attributes | - | 0

No leaderboard results yet.