Benchmarking Neural Network Robustness to Common Corruptions and Perturbations

2019-03-28 · ICLR 2019 · Code Available

Dan Hendrycks, Thomas Dietterich

Code

github.com/hendrycks/robustness (official, in paper; PyTorch; ★ 1,139)
github.com/mr-eggplant/sar (PyTorch; ★ 208)
github.com/mr-eggplant/eata (PyTorch; ★ 138)
github.com/YutingLi0606/SURE (PyTorch; ★ 75)
github.com/automl/nes (PyTorch; ★ 33)
github.com/allenai/robustnav (PyTorch; ★ 31)
github.com/yaodongyu/projnorm (PyTorch; ★ 19)
github.com/MKYucel/zero_shot_corruption_benchmarks (PyTorch; ★ 5)
github.com/yueatsprograms/ttt_cifar_release (PyTorch; ★ 0)
github.com/EPFL-VILAB/XDEnsembles (PyTorch; ★ 0)

Abstract

In this paper we establish rigorous benchmarks for image classifier robustness. Our first benchmark, ImageNet-C, standardizes and expands the corruption robustness topic, while showing which classifiers are preferable in safety-critical applications. Then we propose a new dataset called ImageNet-P, which enables researchers to benchmark a classifier's robustness to common perturbations. Unlike recent robustness research, this benchmark evaluates performance on common corruptions and perturbations, not worst-case adversarial perturbations. We find that there are negligible changes in relative corruption robustness from AlexNet classifiers to ResNet classifiers. Afterward we discover ways to enhance corruption and perturbation robustness. We even find that a bypassed adversarial defense provides substantial common perturbation robustness. Together our benchmarks may aid future work toward networks that robustly generalize.

Tasks

Adversarial Defense, Benchmarking, Domain Generalization

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
ImageNet-C	ResNet-50	mean Corruption Error (mCE)	76.7	—	Unverified
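
The mCE figure in the table can be read more easily with a small worked example. The sketch below assumes the usual definition of the metric: for each corruption type, a model's top-1 error rates are summed over the severity levels and divided by the same sum for an AlexNet baseline, and the resulting ratios are averaged over corruption types and reported as a percentage. The function names, the two corruption types, and every error rate here are hypothetical values chosen for illustration, not results from the paper.

```python
from statistics import mean


def corruption_error(model_errors, baseline_errors):
    """Normalized error for one corruption type.

    Both arguments are lists of top-1 error rates, one entry per severity
    level. The model's summed error is divided by the baseline's summed error.
    """
    if len(model_errors) != len(baseline_errors):
        raise ValueError("model and baseline need the same severity levels")
    return sum(model_errors) / sum(baseline_errors)


def mean_corruption_error(model, baseline):
    """Average the normalized errors over corruption types, as a percentage."""
    ratios = [corruption_error(model[name], baseline[name]) for name in model]
    return 100.0 * mean(ratios)


# Hypothetical error rates at five severities (illustrative only).
model_err = {
    "gaussian_noise": [0.2, 0.2, 0.2, 0.2, 0.2],
    "fog": [0.3, 0.3, 0.3, 0.3, 0.3],
}
baseline_err = {
    "gaussian_noise": [0.4, 0.4, 0.4, 0.4, 0.4],
    "fog": [0.5, 0.5, 0.5, 0.5, 0.5],
}

print(f"mCE: {mean_corruption_error(model_err, baseline_err):.1f}")
```

Under this definition the baseline scores 100 by construction, so a value below 100 means fewer errors than the baseline on corrupted inputs, and lower is better.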
