Removing Out-of-Distribution Data Improves Adversarial Robustness
Anonymous
Abstract
Deep neural networks are vulnerable to maliciously crafted adversarial examples. Existing defenses typically improve adversarial robustness by enlarging the training set with adversarial examples and fitting models on the augmented set. However, the adversarial examples introduced during training may not be natural; they can distort the training distribution, lowering accuracy on clean examples. Enlarging the training set also lengthens training time. We hypothesize that removing out-of-distribution samples from the training set makes the decision boundary of models smoother, leading to more robust models. We propose two methods to detect and remove out-of-distribution samples. Experimental results show that our methods significantly boost the robustness of models without any drop in clean accuracy.
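The abstract does not specify how out-of-distribution samples are detected or removed. As an illustrative sketch only (not the paper's method), one common approach is to assign each training sample an OOD score and drop the highest-scoring fraction before training; the function and score source below are hypothetical:

```python
import numpy as np

def remove_ood_samples(features, labels, ood_scores, drop_fraction=0.1):
    """Keep the (1 - drop_fraction) of samples with the lowest OOD scores.

    Higher score = more out-of-distribution. The scores could come from
    any detector (e.g., negative max-softmax confidence of a pretrained
    model); this sketch assumes they are precomputed.
    """
    n_keep = int(len(ood_scores) * (1.0 - drop_fraction))
    # argsort is ascending, so the first n_keep indices are the most
    # in-distribution samples.
    keep_idx = np.argsort(ood_scores)[:n_keep]
    return features[keep_idx], labels[keep_idx]

# Toy data: 10 samples; the last one is given a clearly high OOD score.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 2))
y = rng.integers(0, 2, size=10)
scores = np.concatenate([rng.uniform(0.0, 0.5, size=9), [0.99]])

X_clean, y_clean = remove_ood_samples(X, y, scores, drop_fraction=0.1)
print(X_clean.shape)  # the single highest-scoring sample was removed
```

A model would then be adversarially trained on the filtered set; the hypothesis above is that the filtered set yields a smoother decision boundary.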