SOTAVerified

Domain Generalization by Mutual-Information Regularization with Pre-trained Models

2022-03-21 · Code Available

Junbum Cha, Kyungjae Lee, Sungrae Park, Sanghyuk Chun


Abstract

Domain generalization (DG) aims to learn a model that generalizes to an unseen target domain using only limited source domains. Previous approaches to DG fail to learn domain-invariant representations from the source domains alone, due to the significant domain shift between training and test domains. Instead, we re-formulate the DG objective using mutual information with the oracle model, a model generalized to any possible domain. We derive a tractable variational lower bound by approximating the oracle model with a pre-trained model, called Mutual Information Regularization with Oracle (MIRO). Our extensive experiments show that MIRO significantly improves out-of-distribution performance. Furthermore, our scaling experiments show that the larger the scale of the pre-trained model, the greater the performance improvement of MIRO. Source code is available at https://github.com/kakaobrain/miro.
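The abstract's variational lower bound can be sketched concretely: MIRO adds a feature-space regularizer that rewards mutual information between the current model's features and those of the frozen pre-trained ("oracle") network, approximated with a diagonal-Gaussian variational distribution. The sketch below is a minimal illustration under that assumption; the function and parameter names (`miro_regularizer`, `log_var`, `bias`) are hypothetical, not the authors' implementation, and the real method applies this per intermediate layer alongside the classification loss.

```python
import numpy as np

def miro_regularizer(feat, oracle_feat, log_var, bias):
    """Sketch of a Gaussian variational lower-bound regularizer.

    feat        : (batch, dim) features from the model being trained
    oracle_feat : (batch, dim) features from the frozen pre-trained model
    log_var     : (dim,) learned per-dimension log-variance of q
    bias        : (dim,) learned mean-shift of q

    With q(z_oracle | z) = N(feat + bias, diag(exp(log_var))), the
    (negated) lower bound reduces, up to constants, to:
        0.5 * E[ log_var + (oracle_feat - mean)^2 / var ]
    Minimizing this term pulls the model's features toward the oracle's.
    """
    var = np.exp(log_var)
    mean = feat + bias
    return 0.5 * np.mean(log_var + (oracle_feat - mean) ** 2 / var)
```

In training, this term would be weighted by a coefficient (lambda in the paper) and added to the standard cross-entropy loss; when the features match the oracle exactly and the variance is 1, the regularizer is 0.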

Benchmark Results

Dataset        | Model                     | Metric           | Claimed | Verified | Status
DomainNet      | MIRO (RegNetY-16GF, SWAD) | Average Accuracy | 60.7    |          | Unverified
DomainNet      | MIRO (ResNet-50, SWAD)    | Average Accuracy | 47      |          | Unverified
Office-Home    | MIRO (RegNetY-16GF, SWAD) | Average Accuracy | 83.3    |          | Unverified
Office-Home    | MIRO (ResNet-50, SWAD)    | Average Accuracy | 72.4    |          | Unverified
PACS           | MIRO (RegNetY-16GF, SWAD) | Average Accuracy | 96.8    |          | Unverified
PACS           | MIRO (ResNet-50, SWAD)    | Average Accuracy | 88.4    |          | Unverified
TerraIncognita | MIRO (RegNetY-16GF, SWAD) | Average Accuracy | 64.3    |          | Unverified
TerraIncognita | MIRO (ResNet-50, SWAD)    | Average Accuracy | 52.9    |          | Unverified
VLCS           | MIRO (RegNetY-16GF, SWAD) | Average Accuracy | 81.7    |          | Unverified
VLCS           | MIRO (ResNet-50, SWAD)    | Average Accuracy | 79.6    |          | Unverified

Reproductions