Open Vocabulary Attribute Detection

Open-Vocabulary Attribute Detection (OVAD) is the task of detecting objects in an image and recognizing their attributes, where both the object classes and the attributes are drawn from an open set. The objects and attributes are specified as text queries at inference time, so the model has no prior knowledge of the tested classes during training.
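As a rough illustration of what "defined by text queries during inference" can look like, the sketch below ranks free-form text queries against a single detected region by cosine similarity, the matching scheme used by contrastive vision-language models such as CLIP. It is a minimal sketch under stated assumptions, not the method of any paper listed here: the function names `cosine` and `score_queries` are hypothetical, and the 3-dimensional vectors are hand-made stand-ins for the outputs of real image and text encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def score_queries(region_embedding, query_embeddings):
    """Rank text queries by similarity to one region embedding.

    query_embeddings maps each query string to its (assumed precomputed)
    text embedding. Returns (query, score) pairs, best match first.
    """
    scores = {q: cosine(region_embedding, e) for q, e in query_embeddings.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy vectors standing in for encoder outputs (assumption, not real data).
region = [0.9, 0.1, 0.2]
queries = {
    "a red car": [1.0, 0.0, 0.1],
    "a blue car": [0.0, 1.0, 0.1],
    "a wooden chair": [0.1, 0.2, 1.0],
}

ranked = score_queries(region, queries)
for query, score in ranked:
    print(f"{score:.3f}  {query}")
```

Because the candidate labels are plain strings supplied at call time, new objects or attributes can be tested by adding entries to `queries`, with no change to the scoring code. A real system would also need a region proposal or detection stage, which this sketch leaves out.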

Papers


Showing 1–14 of 14 papers

Title	Date	Tasks	Status	Hype
Compositional Caching for Training-free Open-vocabulary Attribute Detection	Jan 1, 2025	Attribute, Open Vocabulary Attribute Detection	Unverified	0
Open-ended VQA benchmarking of Vision-Language models by exploiting Classification datasets and their semantic hierarchy	Feb 11, 2024	Language Modeling, Open Vocabulary Attribute Detection	Code Available	1
LOWA: Localize Objects in the Wild with Attributes	May 31, 2023	Attribute, Object	Unverified	0
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models	Jan 30, 2023	Generative Visual Question Answering, Image Captioning	Code Available	4
OvarNet: Towards Open-vocabulary Object Attribute Recognition	Jan 23, 2023	Attribute, Knowledge Distillation	Code Available	1
Reproducible scaling laws for contrastive language-image learning	Dec 14, 2022	Image Classification, Open Vocabulary Attribute Detection	Code Available	1
Open-vocabulary Attribute Detection	Nov 23, 2022	Attribute, Language Modeling	Code Available	1
Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection	Jul 7, 2022	Object, Open Vocabulary Attribute Detection	Code Available	2
Localized Vision-Language Matching for Open-vocabulary Object Detection	May 12, 2022	Language Modeling, Language Modelling	Code Available	1
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation	Jan 28, 2022	Image Captioning, Image-text matching	Code Available	5
Multi-Grained Vision Language Pre-Training: Aligning Texts with Visual Concepts	Nov 16, 2021	Cross-Modal Retrieval, Image Captioning	Code Available	1
Align before Fuse: Vision and Language Representation Learning with Momentum Distillation	Jul 16, 2021	Cross-Modal Retrieval, Grounded language learning	Code Available	1
Learning Transferable Visual Models From Natural Language Supervision	Feb 26, 2021	Action Recognition, Benchmarking	Code Available	2
Open-Vocabulary Object Detection Using Captions	Nov 20, 2020	Object, object-detection	Code Available	1


No leaderboard results yet.