Generalizing to unseen domains via distribution matching

2019-11-03Code Available0· sign in to hype

Isabela Albuquerque, João Monteiro, Mohammad Darvishi, Tiago H. Falk, Ioannis Mitliagkas

Code Available — Be the first to reproduce this paper.

Code

github.com/belaalb/TI-DG
pytorch★ 0
github.com/belaalb/g2dm
pytorch★ 0

Abstract

Supervised learning results typically rely on assumptions of i.i.d. data. Unfortunately, those assumptions are commonly violated in practice. In this work, we tackle such problem by focusing on domain generalization: a formalization where the data generating process at test time may yield samples from never-before-seen domains (distributions). Our work relies on the following lemma: by minimizing a notion of discrepancy between all pairs from a set of given domains, we also minimize the discrepancy between any pairs of mixtures of domains. Using this result, we derive a generalization bound for our setting. We then show that low risk over unseen domains can be achieved by representing the data in a space where (i) the training distributions are indistinguishable, and (ii) relevant information for the task at hand is preserved. Minimizing the terms in our bound yields an adversarial formulation which estimates and minimizes pairwise discrepancies. We validate our proposed strategy on standard domain generalization benchmarks, outperforming a number of recently introduced methods. Notably, we tackle a real-world application where the underlying data corresponds to multi-channel electroencephalography time series from different subjects, each considered as a distinct domain.

Tasks

Domain Generalization LEMMA Object Recognition Representation Learning Time Series Time Series Analysis

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
PACS	G2DM (ResNet-18)	Average Accuracy	83.34	—	Unverified
PACS	G2DM (AlexNet)	Average Accuracy	73.55	—	Unverified

Generalizing to unseen domains via distribution matching

Code

Abstract

Tasks

Benchmark Results

Reproductions