Defending against Backdoor Attack on Deep Neural Networks
Hao Cheng, Kaidi Xu, Sijia Liu, Pin-Yu Chen, Pu Zhao, Xue Lin
Abstract
Although deep neural networks (DNNs) have achieved great success in various computer vision tasks, it has recently been found that they are vulnerable to adversarial attacks. In this paper, we focus on the so-called backdoor attack, which injects a backdoor trigger into a small portion of the training data (also known as data poisoning) such that the trained DNN misclassifies inputs stamped with this trigger. Specifically, we carefully study the effect of both real and synthetic backdoor attacks on the internal responses of vanilla and backdoored DNNs through the lens of Grad-CAM. Moreover, we show that the backdoor attack induces a significant bias in neuron activation in terms of the $\ell_\infty$ norm of an activation map compared to its $\ell_1$ and $\ell_2$ norms. Spurred by these results, we propose $\ell_\infty$-based neuron pruning to remove the backdoor from a backdoored DNN. Experiments show that our method effectively decreases the attack success rate while maintaining high classification accuracy on clean images.
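To make the pruning idea concrete, below is a minimal PyTorch sketch, not the authors' released implementation. The helper names (`linf_channel_scores`, `prune_top_linf_channels`), the use of a forward hook on one convolutional layer, and the choice to zero out the top-k channels with the largest $\ell_\infty$ activation scores are our assumptions for illustration; the paper's exact pruning criterion and layer selection should be taken from the paper itself.

```python
import torch
import torch.nn as nn

def linf_channel_scores(model, layer, clean_loader, device="cpu"):
    """Score each output channel of `layer` by the l-infinity norm of its
    activation map, averaged over a set of clean inputs.

    Assumption: backdoor-related neurons exhibit abnormally large
    l-infinity activations, per the bias the paper reports.
    """
    per_batch = []

    def hook(_module, _inp, out):
        # out has shape (batch, channels, H, W); take the per-channel
        # l-infinity norm over the spatial dimensions.
        per_batch.append(out.detach().abs().amax(dim=(2, 3)))  # (batch, C)

    handle = layer.register_forward_hook(hook)
    model.eval()
    with torch.no_grad():
        for x, _ in clean_loader:
            model(x.to(device))
    handle.remove()
    return torch.cat(per_batch).mean(dim=0)  # (C,) mean score per channel

def prune_top_linf_channels(layer, scores, k):
    """Zero the weights (and biases) of the k highest-scoring channels,
    effectively pruning the suspected backdoor neurons."""
    top = scores.topk(k).indices
    with torch.no_grad():
        layer.weight[top] = 0.0          # Conv2d weight: (out_ch, in_ch, kH, kW)
        if layer.bias is not None:
            layer.bias[top] = 0.0

# Hypothetical usage: prune 5 channels of the last conv layer of `model`
# using a DataLoader of clean images, then re-evaluate attack success rate.
# scores = linf_channel_scores(model, model.conv_last, clean_loader)
# prune_top_linf_channels(model.conv_last, scores, k=5)
```

Zeroing weights (rather than structurally removing channels) keeps tensor shapes intact, so the pruned model can be evaluated or fine-tuned without surgery on the architecture; this is one common way to realize neuron pruning, though the paper may implement it differently.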