PromptStyler: Prompt-driven Style Generation for Source-free Domain Generalization

2023-07-27 · ICCV 2023

Junhyeong Cho, Gilhyun Nam, Sungyeon Kim, Hunmin Yang, Suha Kwak

Code

github.com/zhanghr2001/promptta
PyTorch, 15 stars

Abstract

In a joint vision-language space, a text feature (e.g., from "a photo of a dog") could effectively represent its relevant image features (e.g., from dog photos). Also, a recent study has demonstrated the cross-modal transferability phenomenon of this joint space. From these observations, we propose PromptStyler, which addresses source-free domain generalization by simulating various distribution shifts in the joint space: it synthesizes diverse styles via prompts without using any images. The proposed method learns to generate a variety of style features (from "a S* style of a") via learnable style word vectors for pseudo-words S*. To ensure that learned styles do not distort content information, we force style-content features (from "a S* style of a [class]") to be located near their corresponding content features (from "[class]") in the joint vision-language space. After learning style word vectors, we train a linear classifier using synthesized style-content features. PromptStyler achieves the state of the art on PACS, VLCS, Office-Home and DomainNet, even though it does not require any images for training.
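
The two training signals described in the abstract (styles should be diverse, and style-content features should stay near their own content features) can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the function names, the use of cosine similarity, and the softmax cross-entropy form of the consistency term are all assumptions chosen for illustration, and the toy 2-D vectors stand in for text-encoder features that would normally come from a model such as CLIP.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))


def style_diversity_penalty(style_feats):
    """Mean absolute cosine similarity over all pairs of style features.

    Assumed form: a lower value means the styles point in more
    different directions, i.e. they are more diverse.
    """
    pairs = [(i, j) for i in range(len(style_feats)) for j in range(i)]
    if not pairs:
        return 0.0
    total = sum(abs(cosine(style_feats[i], style_feats[j])) for i, j in pairs)
    return total / len(pairs)


def content_consistency_loss(style_content_feats, content_feats):
    """Cross-entropy over cosine similarities (assumed form).

    The m-th style-content feature is encouraged to be more similar to
    the m-th content feature than to the content features of other classes.
    """
    total = 0.0
    for m, sc in enumerate(style_content_feats):
        sims = [cosine(sc, c) for c in content_feats]
        log_denominator = math.log(sum(math.exp(s) for s in sims))
        total += -(sims[m] - log_denominator)
    return total / len(style_content_feats)


# Toy stand-ins for text features (hypothetical values).
content = [[1.0, 0.0], [0.0, 1.0]]        # e.g. "[class]" features for 2 classes
aligned = [[1.0, 0.0], [0.0, 1.0]]        # style-content features near their class
swapped = [[0.0, 1.0], [1.0, 0.0]]        # style-content features near the wrong class

diverse_penalty = style_diversity_penalty([[1.0, 0.0], [0.0, 1.0]])
redundant_penalty = style_diversity_penalty([[1.0, 0.0], [2.0, 0.0]])
aligned_loss = content_consistency_loss(aligned, content)
swapped_loss = content_consistency_loss(swapped, content)
```

In this toy setting, orthogonal style features receive a lower diversity penalty than parallel ones, and style-content features that sit near their own class receive a lower consistency loss than swapped ones. In a real pipeline, the same quantities would be computed on text-encoder outputs and minimized with respect to the learnable style word vectors.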

Tasks

Domain Generalization, Image Classification, Multi-modal Classification, Multimodal Deep Learning, Out-of-Distribution Generalization, Source-free Domain Generalization, Zero-Shot Image Classification, Zero-Shot Learning

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
DomainNet	PromptStyler (CLIP, ViT-L/14)	Average Accuracy	65.5	—	Unverified
DomainNet	PromptStyler (CLIP, ViT-B/16)	Average Accuracy	59.4	—	Unverified
DomainNet	PromptStyler (CLIP, ResNet-50)	Average Accuracy	49.5	—	Unverified
Office-Home	PromptStyler (CLIP, ViT-L/14)	Average Accuracy	89.1	—	Unverified
Office-Home	PromptStyler (CLIP, ViT-B/16)	Average Accuracy	83.6	—	Unverified
Office-Home	PromptStyler (CLIP, ResNet-50)	Average Accuracy	73.6	—	Unverified
PACS	PromptStyler (CLIP, ViT-L/14)	Average Accuracy	98.6	—	Unverified
PACS	PromptStyler (CLIP, ViT-B/16)	Average Accuracy	97.2	—	Unverified
PACS	PromptStyler (CLIP, ResNet-50)	Average Accuracy	93.2	—	Unverified
VLCS	PromptStyler (CLIP, ViT-B/16)	Average Accuracy	82.9	—	Unverified
VLCS	PromptStyler (CLIP, ViT-L/14)	Average Accuracy	82.4	—	Unverified
VLCS	PromptStyler (CLIP, ResNet-50)	Average Accuracy	82.3	—	Unverified
