
Measuring Robustness to Natural Distribution Shifts in Image Classification

2020-07-01 · NeurIPS 2020 · Code Available

Rohan Taori, Achal Dave, Vaishaal Shankar, Nicholas Carlini, Benjamin Recht, Ludwig Schmidt

Abstract

We study how robust current ImageNet models are to distribution shifts arising from natural variations in datasets. Most research on robustness focuses on synthetic image perturbations (noise, simulated weather artifacts, adversarial examples, etc.), which leaves open how robustness on synthetic distribution shift relates to distribution shift arising in real data. Informed by an evaluation of 204 ImageNet models in 213 different test conditions, we find that there is often little to no transfer of robustness from current synthetic to natural distribution shift. Moreover, most current techniques provide no robustness to the natural distribution shifts in our testbed. The main exception is training on larger and more diverse datasets, which in multiple cases increases robustness, but is still far from closing the performance gaps. Our results indicate that distribution shifts arising in real data are currently an open research problem. We provide our testbed and data as a resource for future work at https://modestyachts.github.io/imagenet-testbed/ .


Benchmark Results

| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| VizWiz-Classification | ResNet-50 (IN-C) | Accuracy - All Images | 38.8 | | Unverified |
| VizWiz-Classification | ResNet-50 (IN-C_brightness) | Accuracy - All Images | 38.8 | | Unverified |
| VizWiz-Classification | ResNet-50 (IN-C_spatter) | Accuracy - All Images | 38.3 | | Unverified |
| VizWiz-Classification | ResNet-50 (IN-C_saturate) | Accuracy - All Images | 38.2 | | Unverified |
| VizWiz-Classification | ResNet-50 (IN-C_pixelate) | Accuracy - All Images | 37.4 | | Unverified |
| VizWiz-Classification | ResNet-50 (IN-C_contrast) | Accuracy - All Images | 36.5 | | Unverified |
| VizWiz-Classification | ResNet-50 (IN-C_jpeg_compression) | Accuracy - All Images | 36.5 | | Unverified |
| VizWiz-Classification | ResNet-50 (IN-C_gaussian_noise) | Accuracy - All Images | 36.4 | | Unverified |
| VizWiz-Classification | ResNet-50 (IN-C_frost) | Accuracy - All Images | 36.1 | | Unverified |
| VizWiz-Classification | ResNet-50 (IN-C_fog_aws) | Accuracy - All Images | 35.9 | | Unverified |
| VizWiz-Classification | ResNet-50 (IN-C_motion_blur) | Accuracy - All Images | 35.7 | | Unverified |
| VizWiz-Classification | ResNet-50 (IN-C_zoom_blur) | Accuracy - All Images | 32.7 | | Unverified |
| VizWiz-Classification | ResNet-50 (IN-C_greyscale) | Accuracy - All Images | 30.2 | | Unverified |
