MixMo: Mixing Multiple Inputs for Multiple Outputs via Deep Subnetworks

2021-03-10 · ICCV 2021

Alexandre Rame, Remy Sun, Matthieu Cord

Code

github.com/alexrame/mixmo-pytorch
Official · PyTorch · ★ 83

Abstract

Recent strategies achieved ensembling "for free" by concurrently fitting diverse subnetworks inside a single base network. The main idea during training is that each subnetwork learns to classify only one of the multiple inputs simultaneously provided. However, the question of how best to mix these multiple inputs has not been studied so far. In this paper, we introduce MixMo, a new generalized framework for learning multi-input multi-output deep subnetworks. Our key motivation is to replace the suboptimal summing operation hidden in previous approaches with a more appropriate mixing mechanism. For that purpose, we draw inspiration from successful mixed sample data augmentations. We show that binary mixing in features, particularly with rectangular patches from CutMix, enhances results by making subnetworks stronger and more diverse. We improve the state of the art for image classification on the CIFAR-100 and Tiny ImageNet datasets. Our easy-to-implement models notably outperform data-augmented deep ensembles, without the inference and memory overheads. As we operate in features and simply better leverage the expressiveness of large networks, we open a new line of research complementary to previous works.
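
The contrast the abstract draws between summing and binary patch mixing in feature space can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the official repository for that): the function names `rectangular_mask`, `sum_mix`, and `patch_mix`, the feature-map shapes, and the fixed area ratio are all assumptions made for illustration, and details such as how the patch area is sampled or whether the mixed features are rescaled are not specified in the abstract and are left out here.

```python
import numpy as np


def rectangular_mask(height, width, area_ratio, rng):
    """Return a binary (height, width) mask with one rectangle of ones.

    The rectangle covers roughly `area_ratio` of the map, in the spirit of
    CutMix patches. Hypothetical helper, not taken from the MixMo code base.
    """
    # Scale both sides by sqrt(area_ratio) so the patch area is ~area_ratio.
    rect_h = int(round(height * np.sqrt(area_ratio)))
    rect_w = int(round(width * np.sqrt(area_ratio)))
    # Pick a top-left corner so the rectangle stays inside the map.
    top = rng.integers(0, height - rect_h + 1)
    left = rng.integers(0, width - rect_w + 1)
    mask = np.zeros((height, width), dtype=np.float32)
    mask[top:top + rect_h, left:left + rect_w] = 1.0
    return mask


def sum_mix(feat_a, feat_b):
    """Baseline mixing: element-wise sum of the two feature maps."""
    return feat_a + feat_b


def patch_mix(feat_a, feat_b, mask):
    """Binary mixing: take feat_a inside the patch and feat_b outside it."""
    return mask * feat_a + (1.0 - mask) * feat_b


rng = np.random.default_rng(0)
# Two assumed feature maps of shape (channels, height, width), one per input.
feat_a = rng.standard_normal((8, 16, 16)).astype(np.float32)
feat_b = rng.standard_normal((8, 16, 16)).astype(np.float32)

mask = rectangular_mask(16, 16, area_ratio=0.25, rng=rng)
summed = sum_mix(feat_a, feat_b)
mixed = patch_mix(feat_a, feat_b, mask)
```

With summing, every spatial location blends both inputs; with the binary mask, each location carries features from exactly one input, which is the property the abstract associates with stronger and more diverse subnetworks.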

Tasks

Image Classification

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
CIFAR-10	WRN-28-10	Percentage correct	97.73	—	Unverified
CIFAR-100	WRN-28-10 * 3	Percentage correct	86.81	—	Unverified
CIFAR-100	WRN-28-10	Percentage correct	85.77	—	Unverified
Tiny ImageNet Classification	PreActResNet-18-3	Validation Acc	70.24	—	Unverified
