Multi-way Encoding for Robustness to Adversarial Attacks

2019-05-01 · ICLR 2019

Donghyun Kim, Sarah Adel Bargal, Jianming Zhang, Stan Sclaroff


Abstract

Deep models are state-of-the-art for many computer vision tasks, including image classification and object detection. However, it has been shown that deep models are vulnerable to adversarial examples. We highlight how one-hot encoding directly contributes to this vulnerability and propose breaking away from this widely used but highly vulnerable mapping. We demonstrate that by leveraging a different output encoding, multi-way encoding, we can make models more robust. Our approach makes it more difficult for adversaries to find useful gradients for generating adversarial attacks. We present state-of-the-art robustness results for black-box and white-box attacks, and achieve higher clean accuracy on four benchmark datasets (MNIST, CIFAR-10, CIFAR-100, and SVHN) when combined with adversarial training. The strength of our approach is also presented in the form of an attack for model watermarking, raising challenges in detecting stolen models.
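
The abstract contrasts one-hot targets with a multi-way output encoding but does not say how the codewords are constructed. As a rough illustration only, the sketch below assumes each class is assigned a random orthonormal codeword longer than the number of classes, that the network regresses onto it with a mean-squared-error loss, and that prediction picks the nearest codeword. The function names, `code_length`, and the Gram-Schmidt construction are hypothetical choices for this example, not details confirmed by the text above.

```python
import math
import random


def one_hot_codebook(num_classes):
    """Conventional targets: class k maps to the k-th standard basis vector."""
    return [[1.0 if i == j else 0.0 for j in range(num_classes)]
            for i in range(num_classes)]


def random_orthogonal_codebook(num_classes, code_length, seed=0):
    """Assumed multi-way targets: one random orthonormal codeword per class.

    Codewords are built with Gram-Schmidt on Gaussian vectors, so
    code_length must be at least num_classes.
    """
    if code_length < num_classes:
        raise ValueError("code_length must be >= num_classes")
    rng = random.Random(seed)
    basis = []
    while len(basis) < num_classes:
        v = [rng.gauss(0.0, 1.0) for _ in range(code_length)]
        # Remove the components along the codewords chosen so far.
        for b in basis:
            proj = sum(x * y for x, y in zip(v, b))
            v = [x - proj * y for x, y in zip(v, b)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 1e-8:  # skip the (unlikely) degenerate draw
            basis.append([x / norm for x in v])
    return basis


def mse_loss(output, target):
    """Mean squared error between a network output and a class codeword."""
    return sum((o - t) ** 2 for o, t in zip(output, target)) / len(target)


def decode(output, codebook):
    """Predict the class whose codeword is closest to the output vector."""
    return min(range(len(codebook)),
               key=lambda k: mse_loss(output, codebook[k]))
```

With 10 classes and, say, `code_length=64`, `decode(codebook[3], codebook)` recovers class 3, and the same decoding rule applied to a one-hot codebook reduces to picking the largest output coordinate when outputs are near a basis vector. The point of the comparison is that one-hot targets are identical across all models, whereas a randomly drawn codebook is not.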

Tasks

Image Classification, Object Detection
