Attention Augmented Convolutional Networks
Irwan Bello, Barret Zoph, Ashish Vaswani, Jonathon Shlens, Quoc V. Le
Code
- github.com/MartinGer/Bottleneck-Transformers-for-Visual-Recognition (PyTorch, ★ 7)
- github.com/MartinGer/Attention-Augmented-Convolutional-Networks (PyTorch, ★ 0)
- github.com/sebastiani/pytorch-attention-augmented-convolution (PyTorch, ★ 0)
- github.com/leaderj1001/Attention-Augmented-Conv2d (PyTorch, ★ 0)
- github.com/khwajawisal/Attention-augmented-Convolutional-Neural-Networks (TensorFlow, ★ 0)
- github.com/Data-drone/attention_augmented_cnn (PyTorch, ★ 0)
- github.com/JinLi711/Convolution_Variants (TensorFlow, ★ 0)
- github.com/titu1994/keras-attention-augmented-convs (TensorFlow, ★ 0)
- github.com/lschirmer/Attention-Augmented-Convolutional-Keras-Networks (TensorFlow, ★ 0)
- github.com/infinitemugen/Attention-Conv-Pytorch (PyTorch, ★ 0)
Abstract
Convolutional networks have been the paradigm of choice in many computer vision applications. The convolution operation, however, has a significant weakness in that it only operates on a local neighborhood, thus missing global information. Self-attention, on the other hand, has emerged as a recent advance for capturing long-range interactions, but has mostly been applied to sequence modeling and generative modeling tasks. In this paper, we consider the use of self-attention for discriminative visual tasks as an alternative to convolutions. We introduce a novel two-dimensional relative self-attention mechanism that proves competitive in replacing convolutions as a stand-alone computational primitive for image classification. We find in control experiments that the best results are obtained when combining both convolutions and self-attention. We therefore propose to augment convolutional operators with this self-attention mechanism by concatenating convolutional feature maps with a set of feature maps produced via self-attention. Extensive experiments show that Attention Augmentation leads to consistent improvements in image classification on ImageNet and object detection on COCO across many different models and scales, including ResNets and a state-of-the-art mobile-constrained network, while keeping the number of parameters similar. In particular, our method achieves a 1.3% top-1 accuracy improvement on ImageNet classification over a ResNet50 baseline and outperforms other attention mechanisms for images such as Squeeze-and-Excitation. It also achieves an improvement of 1.4 mAP in COCO object detection on top of a RetinaNet baseline.
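The core idea in the abstract, concatenating convolutional feature maps with feature maps produced via self-attention, can be sketched in a few lines. The toy code below is a minimal single-head NumPy illustration under simplifying assumptions: it uses a 1x1 convolution (a per-pixel linear map), one attention head, and omits the paper's 2D relative position embeddings and multi-head structure; all function and weight names (`attention_augmented_conv`, `w_conv`, `w_q`, `w_k`, `w_v`) are hypothetical, not from the authors' code.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_augmented_conv(x, w_conv, w_q, w_k, w_v):
    """Toy single-head attention-augmented 1x1 convolution (illustrative only).

    x:      (H, W, C_in) input feature map.
    Returns (H, W, C_conv + C_v): 1x1-conv features concatenated with
    self-attention features computed over all H*W spatial positions.
    """
    H, W, C = x.shape
    flat = x.reshape(H * W, C)             # flatten the spatial grid
    conv_out = flat @ w_conv               # 1x1 conv = per-pixel linear map
    q, k, v = flat @ w_q, flat @ w_k, flat @ w_v
    dk = w_q.shape[1]
    attn = softmax(q @ k.T / np.sqrt(dk))  # (HW, HW) attention weights
    attn_out = attn @ v                    # each position attends to all others
    # Attention augmentation: concatenate conv and attention channels.
    out = np.concatenate([conv_out, attn_out], axis=-1)
    return out.reshape(H, W, -1)
```

Because the attention branch mixes information across every pair of spatial positions, the concatenated output carries both the local features of the convolution and global context, which is the combination the paper finds works best.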
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| CIFAR-100 | AA-Wide-ResNet | Percentage correct | 81.6 | — | Unverified |
| ImageNet | AA-ResNet-152 | Top 1 Accuracy | 79.1 | — | Unverified |