ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness

2018-11-29 · ICLR 2019 · Code available

Robert Geirhos, Patricia Rubisch, Claudio Michaelis, Matthias Bethge, Felix A. Wichmann, Wieland Brendel

Code

github.com/rgeirhos/Stylized-ImageNet (official, in paper, PyTorch, ★ 0)
github.com/LiYingwei/ShapeTextureDebiasedTraining (PyTorch, ★ 111)
github.com/rgeirhos/texture-vs-shape (in paper, PyTorch, ★ 0)
github.com/annstrange/breast-cancer-cnn (TensorFlow, ★ 0)
github.com/mbuet2ner/local-global-features-cnn (PyTorch, ★ 0)
github.com/frank-roesler/Image_Segmentation (PyTorch, ★ 0)

Abstract

Convolutional Neural Networks (CNNs) are commonly thought to recognise objects by learning increasingly complex representations of object shapes. Some recent studies suggest a more important role of image textures. We here put these conflicting hypotheses to a quantitative test by evaluating CNNs and human observers on images with a texture-shape cue conflict. We show that ImageNet-trained CNNs are strongly biased towards recognising textures rather than shapes, which is in stark contrast to human behavioural evidence and reveals fundamentally different classification strategies. We then demonstrate that the same standard architecture (ResNet-50) that learns a texture-based representation on ImageNet is able to learn a shape-based representation instead when trained on "Stylized-ImageNet", a stylized version of ImageNet. This provides a much better fit for human behavioural performance in our well-controlled psychophysical lab setting (nine experiments totalling 48,560 psychophysical trials across 97 observers) and comes with a number of unexpected emergent benefits such as improved object detection performance and previously unseen robustness towards a wide range of image distortions, highlighting advantages of a shape-based representation.
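
The cue-conflict evaluation described in the abstract can be summarised as a single number. The sketch below assumes one common way of scoring it: among trials where the prediction matches either the shape category or the texture category of a cue-conflict image, count the fraction that match the shape. The function name, the trial format, and the example labels are hypothetical and are not taken from the authors' code.

```python
from typing import Iterable, Tuple

# Each trial is (predicted_label, shape_label, texture_label) for one
# cue-conflict image. This tuple layout is an assumption for illustration.
Trial = Tuple[str, str, str]


def shape_bias(trials: Iterable[Trial]) -> float:
    """Fraction of shape decisions among all shape-or-texture decisions."""
    shape_hits = 0
    texture_hits = 0
    for predicted, shape_label, texture_label in trials:
        if shape_label == texture_label:
            continue  # not a cue conflict, so it says nothing about bias
        if predicted == shape_label:
            shape_hits += 1
        elif predicted == texture_label:
            texture_hits += 1
        # predictions matching neither cue are ignored
    decided = shape_hits + texture_hits
    if decided == 0:
        raise ValueError("no trial matched the shape or texture category")
    return shape_hits / decided


# Made-up example trials, not data from the paper.
example_trials = [
    ("cat", "cat", "elephant"),       # shape decision
    ("elephant", "dog", "elephant"),  # texture decision
    ("clock", "car", "clock"),        # texture decision
    ("clock", "bear", "clock"),       # texture decision
    ("bottle", "cat", "elephant"),    # neither cue, ignored
]
print(shape_bias(example_trials))
```

A value near 1 would indicate mostly shape-based decisions and a value near 0 mostly texture-based ones. Trials matching neither cue are excluded so that overall accuracy and cue preference are not conflated.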

Tasks

Domain Generalization, Image Classification, Object Detection, Object Recognition, Out-of-Distribution Generalization

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
ImageNet-A	Stylized ImageNet (ResNet-50)	Top-1 accuracy %	2.3	—	Unverified
ImageNet-C	Stylized ImageNet (ResNet-50)	mean Corruption Error (mCE)	69.3	—	Unverified
ImageNet-R	Stylized ImageNet (ResNet-50)	Top-1 Error Rate	58.5	—	Unverified
VizWiz-Classification	ResNet-50 (SIN)	Accuracy - All Images	25.3	—	Unverified
VizWiz-Classification	ResNet-50 (SIN_IN_IN)	Accuracy - All Images	39.2	—	Unverified
VizWiz-Classification	ResNet-50 (SIN_IN)	Accuracy - All Images	38.2	—	Unverified
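
For readers unfamiliar with the mCE metric in the ImageNet-C row, here is a minimal sketch of how mean Corruption Error is usually computed, assuming the convention of the ImageNet-C benchmark: for each corruption type, the model's error rates summed over severity levels are divided by those of a baseline model (AlexNet in that benchmark), and the ratios are averaged and reported as a percentage. The function names and the numbers in the example are illustrative assumptions, not results from this page.

```python
from statistics import mean
from typing import Dict, List

# Maps corruption name -> top-1 error rates (0..1), one per severity level.
ErrorTable = Dict[str, List[float]]


def corruption_error(model_errs: List[float], baseline_errs: List[float]) -> float:
    """Model error summed over severities, normalised by the baseline's."""
    return sum(model_errs) / sum(baseline_errs)


def mean_corruption_error(model: ErrorTable, baseline: ErrorTable) -> float:
    """Average the per-corruption ratios and express the result in percent."""
    ratios = [corruption_error(model[name], baseline[name]) for name in model]
    return 100.0 * mean(ratios)


# Made-up error rates with two severity levels per corruption.
model_errors = {"gaussian_noise": [0.2, 0.4], "fog": [0.3, 0.3]}
baseline_errors = {"gaussian_noise": [0.4, 0.8], "fog": [0.6, 0.6]}
print(mean_corruption_error(model_errors, baseline_errors))
```

Under this convention a score of 100 means the model is as error-prone as the baseline on corrupted images, and lower is better.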
