Denoised Smoothing: A Provable Defense for Pretrained Classifiers
Hadi Salman, Mingjie Sun, Greg Yang, Ashish Kapoor, J. Zico Kolter
Code
- github.com/microsoft/blackbox-smoothing (official, in paper; PyTorch)
- github.com/microsoft/denoised-smoothing (official, in paper; PyTorch)
- github.com/locuslab/breaking-poisoned-classifier (PyTorch)
- github.com/sayakpaul/Denoised-Smoothing-TF (TensorFlow)
Abstract
We present a method for provably defending any pretrained image classifier against ℓ_p adversarial attacks. This method, for instance, allows public vision API providers and users to seamlessly convert pretrained non-robust classification services into provably robust ones. By prepending a custom-trained denoiser to any off-the-shelf image classifier and using randomized smoothing, we effectively create a new classifier that is guaranteed to be ℓ_p-robust to adversarial examples, without modifying the pretrained classifier. Our approach applies to both the white-box and the black-box settings of the pretrained classifier. We refer to this defense as denoised smoothing, and we demonstrate its effectiveness through extensive experimentation on ImageNet and CIFAR-10. Finally, we use our approach to provably defend the Azure, Google, AWS, and Clarifai image classification APIs. Our code replicating all the experiments in the paper can be found at: https://github.com/microsoft/denoised-smoothing.
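In code, the composite classifier is simply a denoiser followed by the unmodified pretrained model, queried under Gaussian noise. The sketch below is a minimal illustration, not the released implementation: the class name `DenoisedSmoothing`, the `denoiser`, `sigma`, and `n_samples` parameters are hypothetical, and the full certification step (converting vote counts into a certified ℓ₂ radius, as in Cohen et al.'s randomized smoothing) is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenoisedSmoothing(nn.Module):
    """Illustrative sketch of denoised smoothing: a custom-trained denoiser
    is prepended to an off-the-shelf classifier, and the composite is
    evaluated under Gaussian noise as in randomized smoothing."""

    def __init__(self, denoiser: nn.Module, classifier: nn.Module, sigma: float):
        super().__init__()
        self.denoiser = denoiser      # custom-trained image denoiser
        self.classifier = classifier  # pretrained classifier, left untouched
        self.sigma = sigma            # std. dev. of the smoothing noise

    @torch.no_grad()
    def predict(self, x: torch.Tensor, n_samples: int = 100) -> int:
        """Majority-vote prediction for a single image x of shape (1, C, H, W)."""
        counts = None
        for _ in range(n_samples):
            noisy = x + self.sigma * torch.randn_like(x)    # add Gaussian noise
            logits = self.classifier(self.denoiser(noisy))  # denoise, then classify
            votes = F.one_hot(logits.argmax(dim=1), logits.shape[-1])
            counts = votes if counts is None else counts + votes
        return counts.argmax(dim=1).item()  # class with the most votes
```

Because the pretrained classifier is only ever called on denoised inputs, the same wrapper applies in the black-box setting, where `classifier` could be a remote vision API rather than a local model.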