Robustness May Be at Odds with Accuracy

2018-05-30 · ICLR 2019 · Code Available

Dimitris Tsipras, Shibani Santurkar, Logan Engstrom, Alexander Turner, Aleksander Madry

Code

github.com/louis2889184/pytorch-adversarial-training (pytorch, ★ 254)
github.com/conference-submission-anon/LAT_adversarial_robustness (tf, ★ 0)
github.com/XgDuan/pytorch-adversarial-training-nonexpansive (pytorch, ★ 0)
github.com/louis2889184/adversarial_training (pytorch, ★ 0)
github.com/msingh27/LAT_adversarial_robustness (tf, ★ 0)
github.com/AugustineCha/pytorch-adversarial-training-master (pytorch, ★ 0)
github.com/MadryLab/robust-features-code (tf, ★ 0)

Abstract

We show that there may exist an inherent tension between the goal of adversarial robustness and that of standard generalization. Specifically, training robust models may not only be more resource-consuming, but also lead to a reduction of standard accuracy. We demonstrate that this trade-off between the standard accuracy of a model and its robustness to adversarial perturbations provably exists in a fairly simple and natural setting. These findings also corroborate a similar phenomenon observed empirically in more complex settings. Further, we argue that this phenomenon is a consequence of robust classifiers learning fundamentally different feature representations than standard classifiers. These differences, in particular, seem to result in unexpected benefits: the representations learned by robust models tend to align better with salient data characteristics and human perception.
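
The trade-off described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' code: it assumes a binary task with one strongly predictive feature (agreeing with the label with probability p) and many weakly predictive Gaussian features, in the spirit of the "fairly simple and natural setting" the abstract mentions. The function names and the constants (d = 100, p = 0.95, eta = 3/sqrt(d), eps = 2*eta) are illustrative assumptions and do not come from this page.

```python
import math
import random


def sample(n, d, p, eta, rng):
    """Draw n labelled points (hypothetical toy distribution).

    y is uniform in {-1, +1}; x[0] equals y with probability p, else -y;
    x[1:] are d independent Gaussians with mean eta * y and unit variance.
    """
    data = []
    for _ in range(n):
        y = rng.choice((-1, 1))
        strong = y if rng.random() < p else -y
        weak = [rng.gauss(eta * y, 1.0) for _ in range(d)]
        data.append(([strong] + weak, y))
    return data


def avg_weak(x):
    """'Standard' classifier: sign of the sum of the weak features only."""
    return 1 if sum(x[1:]) >= 0 else -1


def strong_only(x):
    """'Robust' classifier: sign of the single strong feature only."""
    return 1 if x[0] >= 0 else -1


def accuracy(clf, data, eps=0.0):
    """Accuracy under an l_inf perturbation of size eps.

    Every coordinate is shifted by eps against the true label. For the two
    linear classifiers above (non-negative weights) this is a worst-case
    l_inf attack; eps = 0 gives standard accuracy.
    """
    correct = 0
    for x, y in data:
        x_adv = [xi - eps * y for xi in x]
        correct += clf(x_adv) == y
    return correct / len(data)


if __name__ == "__main__":
    d, p = 100, 0.95              # illustrative constants (assumptions)
    eta = 3 / math.sqrt(d)        # weak-feature mean
    eps = 2 * eta                 # large enough to flip the weak-feature means
    data = sample(2000, d, p, eta, random.Random(0))
    for name, clf in (("avg_weak", avg_weak), ("strong_only", strong_only)):
        print(name,
              "standard:", accuracy(clf, data),
              "adversarial:", accuracy(clf, data, eps))
```

Under these assumptions, the classifier that averages the weak features should be highly accurate on clean data but collapse once the perturbation shifts every weak feature's mean to the wrong side, while the classifier that relies only on the strong feature stays at roughly p in both cases. That is the sense in which a robust model may give up standard accuracy.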

Tasks

Adversarial Robustness
