Defending Against Universal Perturbations With Shared Adversarial Training

2018-12-10 · ICCV 2019

Chaithanya Kumar Mummadi, Thomas Brox, Jan Hendrik Metzen

Abstract

Classifiers such as deep neural networks have been shown to be vulnerable to adversarial perturbations on problems with high-dimensional input spaces. While adversarial training improves the robustness of image classifiers against such adversarial perturbations, it leaves them sensitive to perturbations on a non-negligible fraction of the inputs. In this work, we show that adversarial training is more effective at preventing universal perturbations, where the same perturbation needs to fool a classifier on many inputs. Moreover, we investigate the trade-off between robustness against universal perturbations and performance on unperturbed data and propose an extension of adversarial training that handles this trade-off more gracefully. We present results for image classification and semantic segmentation to showcase that universal perturbations that fool a model hardened with adversarial training become clearly perceptible and show patterns of the target scene.
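
The distinction the abstract draws between per-input adversarial perturbations and a universal perturbation (one perturbation shared by many inputs) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the two-dimensional linear classifier, the logistic loss, the single sign-gradient step, the L-infinity budget `eps`, and all function names. It is not the authors' training procedure or experimental setup.

```python
import math

def loss_grad_wrt_input(w, b, x, y):
    """Gradient of log(1 + exp(-y * (w.x + b))) with respect to x, for y in {-1, +1}."""
    margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    coeff = -y / (1.0 + math.exp(margin))
    return [coeff * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def per_example_perturbations(w, b, xs, ys, eps):
    """One sign-gradient step per input: every input gets its own perturbation."""
    return [[eps * sign(g) for g in loss_grad_wrt_input(w, b, x, y)]
            for x, y in zip(xs, ys)]

def universal_perturbation(w, b, xs, ys, eps):
    """One sign-gradient step on the summed loss: a single perturbation for all inputs."""
    total = [0.0] * len(w)
    for x, y in zip(xs, ys):
        g = loss_grad_wrt_input(w, b, x, y)
        total = [t + gi for t, gi in zip(total, g)]
    return [eps * sign(t) for t in total]

def accuracy(w, b, xs, ys, deltas):
    """Fraction of inputs still classified correctly after adding deltas[i] to xs[i]."""
    correct = 0
    for x, y, d in zip(xs, ys, deltas):
        score = sum(wi * (xi + di) for wi, xi, di in zip(w, x, d)) + b
        correct += (1 if score > 0 else -1) == y
    return correct / len(xs)

# Hypothetical toy data: the classifier predicts +1 when x[0] > x[1].
w, b, eps = [1.0, -1.0], 0.0, 0.6
xs = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.5], [0.2, 1.5]]
ys = [1, -1, 1, -1]

acc_per_example = accuracy(w, b, xs, ys, per_example_perturbations(w, b, xs, ys, eps))
delta = universal_perturbation(w, b, xs, ys, eps)
acc_universal = accuracy(w, b, xs, ys, [delta] * len(xs))
```

On this toy data the per-input attack leaves 2 of 4 inputs correctly classified, while the single shared perturbation leaves 3 of 4, because one direction cannot push inputs from both classes across the decision boundary at once. This only illustrates why a universal perturbation is a more constrained attack; it says nothing about the magnitude of the effect on real image classifiers.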

Tasks

Image Classification, Semantic Segmentation
