
Open-Vocabulary Attribute Detection

Open-Vocabulary Attribute Detection (OVAD) is the task of detecting an open set of objects in an image and recognizing their associated attributes. The objects and attributes are specified by text queries at inference time, so the classes evaluated at test time do not need to be known during training.
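
To make the idea of text-defined queries concrete, here is a minimal sketch of one common way such queries can be scored: compare the embedding of a detected region against the embeddings of free-form text queries and rank them by cosine similarity. This is an illustration under stated assumptions, not the method of any paper listed here. The function names `cosine` and `score_queries` are hypothetical, and the 3-dimensional embeddings are hand-made toy values standing in for the outputs of a vision-language model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def score_queries(region_embedding, query_embeddings):
    """Rank text queries by similarity to one detected region.

    query_embeddings maps query text -> embedding vector. In a real system
    both sides would come from a vision-language model; here they are toy
    values chosen by hand (an assumption for illustration only).
    """
    scores = {q: cosine(region_embedding, e) for q, e in query_embeddings.items()}
    # Highest similarity first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical embeddings: one detected region and three attribute queries.
region = [0.9, 0.1, 0.2]
queries = {
    "a red object": [1.0, 0.0, 0.1],
    "a wooden object": [0.0, 1.0, 0.0],
    "a striped object": [0.1, 0.2, 1.0],
}

ranked = score_queries(region, queries)
for text, score in ranked:
    print(f"{text}: {score:.3f}")
```

Because the queries are plain strings, new attributes can be added at inference time by extending the `queries` dictionary, without retraining anything. That is the property the "open-vocabulary" setting relies on.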

Papers

Showing 1–10 of 14 papers

Title | Status | Hype
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation | Code | 5
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models | Code | 4
Learning Transferable Visual Models From Natural Language Supervision | Code | 2
Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection | Code | 2
Align before Fuse: Vision and Language Representation Learning with Momentum Distillation | Code | 1
OvarNet: Towards Open-vocabulary Object Attribute Recognition | Code | 1
Reproducible scaling laws for contrastive language-image learning | Code | 1
Localized Vision-Language Matching for Open-vocabulary Object Detection | Code | 1
Multi-Grained Vision Language Pre-Training: Aligning Texts with Visual Concepts | Code | 1
Open-ended VQA benchmarking of Vision-Language models by exploiting Classification datasets and their semantic hierarchy | Code | 1

No leaderboard results yet.