MixMatch: A Holistic Approach to Semi-Supervised Learning
David Berthelot, Nicholas Carlini, Ian Goodfellow, Nicolas Papernot, Avital Oliver, Colin Raffel
Code
- github.com/google-research/mixmatch (official, in paper; TensorFlow) ★ 0
- github.com/google-research/crest (TensorFlow) ★ 100
- github.com/rit-git/Snippext_public (PyTorch) ★ 57
- github.com/smkim7-kr/albu-MixMatch-pytorch (PyTorch) ★ 2
- github.com/yuxi120407/mixmatch_tensorflow (TensorFlow) ★ 0
- github.com/kevinghst/mixmatch (PyTorch) ★ 0
- github.com/filaPro/visda2019 (TensorFlow) ★ 0
- github.com/FelixAbrahamsson/mixmatch-pytorch (PyTorch) ★ 0
- github.com/TianheWu/LGPNet (PyTorch) ★ 0
- github.com/ms903-github/MixMatch-imdb (PyTorch) ★ 0
Abstract
Semi-supervised learning has proven to be a powerful paradigm for leveraging unlabeled data to mitigate the reliance on large labeled datasets. In this work, we unify the current dominant approaches for semi-supervised learning to produce a new algorithm, MixMatch, that works by guessing low-entropy labels for data-augmented unlabeled examples and mixing labeled and unlabeled data using MixUp. We show that MixMatch obtains state-of-the-art results by a large margin across many datasets and labeled data amounts. For example, on CIFAR-10 with 250 labels, we reduce error rate by a factor of 4 (from 38% to 11%) and by a factor of 2 on STL-10. We also demonstrate how MixMatch can help achieve a dramatically better accuracy-privacy trade-off for differential privacy. Finally, we perform an ablation study to tease apart which components of MixMatch are most important for its success.
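The two core ingredients the abstract names — sharpening guessed labels toward low entropy, and mixing labeled and unlabeled examples with MixUp — can be sketched in a few lines of NumPy. This is a minimal illustration of those two operations only, not the full MixMatch training loop; the temperature `T=0.5` and Beta parameter `alpha=0.75` follow the paper's defaults, while the function names are ours.

```python
import numpy as np

def sharpen(p, T=0.5):
    """Lower the entropy of a guessed class distribution p.

    Raises each probability to the power 1/T and renormalizes;
    as T -> 0 the output approaches a one-hot vector.
    """
    p = np.asarray(p, dtype=float)
    p_t = p ** (1.0 / T)
    return p_t / p_t.sum(axis=-1, keepdims=True)

def mixup(x1, y1, x2, y2, alpha=0.75, rng=None):
    """MixUp a (features, label) pair with another example.

    MixMatch uses lambda' = max(lambda, 1 - lambda) so the mixed
    example stays closer to the first input than to the second.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    lam = rng.beta(alpha, alpha)
    lam = max(lam, 1.0 - lam)
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2
    return x, y
```

In the full algorithm, `sharpen` is applied to the average of the model's predictions over several augmentations of each unlabeled example, and `mixup` is then applied between the combined labeled-plus-unlabeled batch and a shuffled copy of itself.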
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| CIFAR-10 | MixMatch | Percentage correct | 95.05 | — | Unverified |
| CIFAR-100 | MixMatch | Percentage correct | 74.1 | — | Unverified |
| STL-10 | CutOut | Percentage correct | 87.36 | — | Unverified |
| STL-10 | IIC | Percentage correct | 88.8 | — | Unverified |
| SVHN | MixMatch | Percentage error | 2.59 | — | Unverified |