SOTAVerified

Read-only Prompt Optimization for Vision-Language Few-shot Learning

2023-08-29 · ICCV 2023 · Code Available

Dongjun Lee, Seokwon Song, Jihee Suh, Joonmyung Choi, Sanghyeok Lee, Hyunwoo J. Kim


Abstract

In recent years, prompt tuning has proven effective in adapting pre-trained vision-language models to downstream tasks. These methods adapt the pre-trained models by introducing learnable prompts while keeping the pre-trained weights frozen. However, learnable prompts can shift the internal representations within the self-attention module, which may hurt performance variance and generalization, especially in data-deficient settings. To address these issues, we propose a novel approach, Read-only Prompt Optimization (RPO). RPO leverages masked attention to prevent the internal representation shift in the pre-trained model. Further, to facilitate the optimization of RPO, the read-only prompts are initialized based on special tokens of the pre-trained model. Our extensive experiments demonstrate that RPO outperforms CLIP and CoCoOp in base-to-new generalization and domain generalization while displaying better robustness. The proposed method also generalizes better in extremely data-deficient settings, while improving parameter efficiency and reducing computational overhead. Code is available at https://github.com/mlvlab/RPO.
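The core mechanism in the abstract is an attention mask that lets appended prompts "read" the frozen features without writing back into them. The following is a minimal sketch of that idea, not the authors' implementation; the function name and mask convention are ours.

```python
def rpo_attention_mask(num_tokens: int, num_prompts: int) -> list[list[bool]]:
    """Boolean attention mask for read-only prompts (True = attention allowed).

    The sequence is `num_tokens` original tokens followed by `num_prompts`
    read-only prompt tokens. Original tokens may only attend to other
    original tokens, so their representations are unaffected by the prompts.
    Prompt tokens may attend to the whole sequence, i.e. they only "read".
    """
    total = num_tokens + num_prompts
    mask = [[False] * total for _ in range(total)]
    for i in range(total):
        for j in range(total):
            if i < num_tokens:
                # Original-token row: attend only within the original tokens.
                mask[i][j] = j < num_tokens
            else:
                # Read-only prompt row: attend to everything.
                mask[i][j] = True
    return mask
```

Such a mask can be passed (after conversion to the framework's expected format) to a standard masked self-attention layer, leaving the frozen tokens' forward pass identical to the original model's.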

Tasks

Benchmark Results

Dataset | Model | Metric | Claimed | Verified | Status
Caltech-101 | RPO | Harmonic mean | 96.03 | — | Unverified
DTD | RPO | Harmonic mean | 68.61 | — | Unverified
EuroSAT | RPO | Harmonic mean | 76.79 | — | Unverified
FGVC-Aircraft | RPO | Harmonic mean | 35.7 | — | Unverified
Food-101 | RPO | Harmonic mean | 90.58 | — | Unverified
ImageNet | RPO | Harmonic mean | 74 | — | Unverified
Oxford 102 Flower | RPO | Harmonic mean | 84.5 | — | Unverified
Oxford-IIIT Pet Dataset | RPO | Harmonic mean | 96.05 | — | Unverified
Stanford Cars | RPO | Harmonic mean | 74.69 | — | Unverified
SUN397 | RPO | Harmonic mean | 79.18 | — | Unverified
UCF101 | RPO | Harmonic mean | 79.34 | — | Unverified
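In base-to-new generalization benchmarks like the table above, "Harmonic mean" typically denotes the harmonic mean of base-class and new-class accuracy, which penalizes models that trade one for the other. A minimal sketch (function name and example values are ours, not from the table):

```python
def harmonic_mean(base_acc: float, new_acc: float) -> float:
    """Harmonic mean of base-class and new-class accuracy (in %)."""
    return 2 * base_acc * new_acc / (base_acc + new_acc)
```

For instance, a model with 80% base accuracy and 60% new-class accuracy scores about 68.57, noticeably below the arithmetic mean of 70.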

Reproductions