ResMLP: Feedforward networks for image classification with data-efficient training
Hugo Touvron, Piotr Bojanowski, Mathilde Caron, Matthieu Cord, Alaaeldin El-Nouby, Edouard Grave, Gautier Izacard, Armand Joulin, Gabriel Synnaeve, Jakob Verbeek, Hervé Jégou
Code
- github.com/facebookresearch/deit (Official, pytorch) ★ 4,327
- github.com/rwightman/pytorch-image-models (In paper, pytorch) ★ 36,538
- github.com/xmu-xiaoma666/External-Attention-pytorch (pytorch) ★ 12,169
- github.com/martinsbruveris/tensorflow-image-models (tf) ★ 291
- github.com/Mayurji/Image-Classification-PyTorch (pytorch) ★ 219
- github.com/lucidrains/res-mlp-pytorch (pytorch) ★ 201
- github.com/liuruiyang98/Jittor-MLP (jax) ★ 170
- github.com/lalithjets/surgical_vqa (pytorch) ★ 63
- github.com/rishikksh20/ResMLP-pytorch (pytorch) ★ 45
- github.com/leaderj1001/Bag-of-MLP (pytorch) ★ 20
Abstract
We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification. It is a simple residual network that alternates (i) a linear layer in which image patches interact, independently and identically across channels, and (ii) a two-layer feed-forward network in which channels interact independently per patch. When trained with a modern training strategy using heavy data-augmentation and optionally distillation, it attains surprisingly good accuracy/complexity trade-offs on ImageNet. We also train ResMLP models in a self-supervised setup, to further remove priors from employing a labelled dataset. Finally, by adapting our model to machine translation we achieve surprisingly good results. We share pre-trained models and our code based on the Timm library.
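The two alternating sublayers described in the abstract can be illustrated with a minimal NumPy sketch of a single residual block. It is not the released implementation: the function and parameter names (`resmlp_block`, `patch_w`, `w1`, and so on), the GELU activation, the initialization scale, and the toy sizes are all assumptions made for illustration, and details beyond the abstract, such as normalization, are left out.

```python
import numpy as np


def gelu(x):
    # Tanh approximation of GELU. The activation choice is an assumption;
    # the abstract only says "two-layer feed-forward network".
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def resmlp_block(x, params):
    """Apply one residual block to x of shape (num_patches, channels)."""
    # (i) Linear layer across patches: the same (num_patches x num_patches)
    # matrix is applied to every channel column, so channels do not mix here.
    x = x + params["patch_w"] @ x + params["patch_b"][:, None]
    # (ii) Two-layer feed-forward network across channels: the same weights
    # are applied to every patch row, so patches do not mix here.
    hidden = gelu(x @ params["w1"] + params["b1"])
    return x + hidden @ params["w2"] + params["b2"]


def init_params(rng, num_patches, channels, hidden):
    # Small random weights and zero biases; the scale is a hypothetical choice.
    scale = 0.02
    return {
        "patch_w": scale * rng.standard_normal((num_patches, num_patches)),
        "patch_b": np.zeros(num_patches),
        "w1": scale * rng.standard_normal((channels, hidden)),
        "b1": np.zeros(hidden),
        "w2": scale * rng.standard_normal((hidden, channels)),
        "b2": np.zeros(channels),
    }


rng = np.random.default_rng(0)
num_patches, channels, hidden = 16, 8, 32  # toy sizes, not the paper's
params = init_params(rng, num_patches, channels, hidden)
x = rng.standard_normal((num_patches, channels))  # one image as patch embeddings
y = resmlp_block(x, params)
print(y.shape)  # (16, 8)
```

Because both sublayers are residual, the block preserves the input shape, so a full network in this style would simply stack such blocks between a patch-embedding step and a classifier head.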
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| Oxford 102 Flowers | ResMLP-12 | Accuracy | 97.4 | — | Unverified |
| Oxford 102 Flowers | ResMLP-24 | Accuracy | 97.9 | — | Unverified |