
A principled approach for generating adversarial images under non-smooth dissimilarity metrics

2019-08-05

Aram-Alexandre Pooladian, Chris Finlay, Tim Hoheisel, Adam Oberman

Code Available

Abstract

Deep neural networks perform well on real world data but are prone to adversarial perturbations: small changes in the input easily lead to misclassification. In this work, we propose an attack methodology not only for cases where the perturbations are measured by $\ell_p$ norms, but in fact any adversarial dissimilarity metric with a closed proximal form. This includes, but is not limited to, $\ell_1$, $\ell_2$, and $\ell_\infty$ perturbations; the $\ell_0$ counting "norm" (i.e. true sparseness); and the total variation seminorm, which is a (non-$\ell_p$) convolutional dissimilarity measuring local pixel changes. Our approach is a natural extension of a recent adversarial attack method, and eliminates the differentiability requirement of the metric. We demonstrate our algorithm, ProxLogBarrier, on the MNIST, CIFAR10, and ImageNet-1k datasets. We consider undefended and defended models, and show that our algorithm easily transfers to various datasets. We observe that ProxLogBarrier outperforms a host of modern adversarial attacks specialized for the $\ell_0$ case. Moreover, by altering images in the total variation seminorm, we shed light on a new class of perturbations that exploit neighboring pixel information.
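
The abstract's requirement of a "closed proximal form" can be illustrated with a short sketch. The functions below are not taken from the paper or its code release, and they are not the ProxLogBarrier algorithm; they are the standard textbook proximal operators for three of the metrics named above, assuming the convention prox(v) = argmin_x lam * f(x) + 0.5 * ||x - v||^2. The function names and the plain-list representation of a perturbation are hypothetical choices made for illustration.

```python
import math

# Illustrative only: standard closed-form proximal operators, under the
# convention prox(v) = argmin_x  lam * f(x) + 0.5 * ||x - v||_2^2.
# Names are hypothetical and do not come from the paper's code.


def prox_l1(v, lam):
    """Soft thresholding: prox of lam * ||x||_1, applied per coordinate."""
    out = []
    for vi in v:
        if vi > lam:
            out.append(vi - lam)
        elif vi < -lam:
            out.append(vi + lam)
        else:
            out.append(0.0)
    return out


def prox_l0(v, lam):
    """Hard thresholding: prox of lam * ||x||_0 (the counting "norm").

    Keeping coordinate vi costs lam; zeroing it costs 0.5 * vi**2,
    so vi is kept only when |vi| > sqrt(2 * lam).
    """
    cutoff = math.sqrt(2.0 * lam)
    return [vi if abs(vi) > cutoff else 0.0 for vi in v]


def prox_l2(v, lam):
    """Block soft thresholding: prox of lam * ||x||_2 (unsquared norm)."""
    norm = math.sqrt(sum(vi * vi for vi in v))
    if norm <= lam:
        return [0.0 for _ in v]
    scale = 1.0 - lam / norm
    return [scale * vi for vi in v]
```

Each operator is evaluated exactly without differentiating the metric, which is the property the abstract relies on. For example, `prox_l0([3.0, -0.5, 1.0], 0.5)` zeroes every entry whose magnitude is at most 1.0 and leaves the rest unchanged, giving a truly sparse result rather than a merely shrunken one.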
