Certified Adversarial Robustness via Randomized Smoothing
Jeremy M Cohen, Elan Rosenfeld, J. Zico Kolter
Code
- github.com/locuslab/smoothing (official, in paper; PyTorch; ★ 0)
- github.com/sayakpaul/Denoised-Smoothing-TF (TensorFlow; ★ 20)
- github.com/akshaymehra24/poisoning_certified_defenses (TensorFlow; ★ 14)
- github.com/llylly/dsrs (PyTorch; ★ 9)
- github.com/jayjaynandy/RBF-CNN (TensorFlow; ★ 8)
- github.com/aounon/distributional-robustness (PyTorch; ★ 3)
- github.com/alevine0/smoothingGenGaussian (PyTorch; ★ 3)
- github.com/RaphaelOlivier/gard_eval2_public (PyTorch; ★ 0)
- github.com/xzh0u/randomized-smoothing (PyTorch; ★ 0)
- github.com/mwojnars/nifty (TensorFlow; ★ 0)
Abstract
We show how to turn any classifier that classifies well under Gaussian noise into a new classifier that is certifiably robust to adversarial perturbations under the $\ell_2$ norm. This "randomized smoothing" technique has been proposed recently in the literature, but existing guarantees are loose. We prove a tight robustness guarantee in $\ell_2$ norm for smoothing with Gaussian noise. We use randomized smoothing to obtain an ImageNet classifier with e.g. a certified top-1 accuracy of 49% under adversarial perturbations with $\ell_2$ norm less than 0.5 (=127/255). No certified defense has been shown feasible on ImageNet except for smoothing. On smaller-scale datasets where competing approaches to certified $\ell_2$ robustness are viable, smoothing delivers higher certified accuracies. Our strong empirical results suggest that randomized smoothing is a promising direction for future research into adversarially robust classification. Code and models are available at http://github.com/locuslab/smoothing.
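
As a rough illustration of the idea in the abstract, the sketch below classifies Gaussian-noised copies of an input, takes the majority class, and turns a lower confidence bound on that class's probability into a certified radius in the L2 norm, using radius = sigma * Phi^{-1}(p_lower). It is a minimal standard-library sketch, not the code from the official repository. The names `sample_noisy_votes` and `certify`, the parameters `n0`, `n`, and `alpha`, and the toy classifier are assumptions made for illustration. For simplicity it swaps in a Hoeffding bound where the paper uses a Clopper-Pearson interval, so the radii it reports are more conservative.

```python
import math
import random
from collections import Counter
from statistics import NormalDist


def sample_noisy_votes(base_classifier, x, sigma, n, rng):
    """Count the classes base_classifier predicts on n Gaussian-noised copies of x."""
    votes = Counter()
    for _ in range(n):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        votes[base_classifier(noisy)] += 1
    return votes


def certify(base_classifier, x, sigma, n0=100, n=1000, alpha=0.001, rng=None):
    """Return (predicted_class, certified_l2_radius), or (None, 0.0) to abstain.

    Hypothetical helper: a small selection batch (n0) picks the candidate class,
    and a separate estimation batch (n) bounds its probability under noise.
    """
    rng = rng or random.Random(0)
    # Step 1: guess the majority class from a small batch of noisy samples.
    guess = sample_noisy_votes(base_classifier, x, sigma, n0, rng).most_common(1)[0][0]
    # Step 2: estimate how often that class wins on fresh noisy samples.
    count = sample_noisy_votes(base_classifier, x, sigma, n, rng)[guess]
    # Hoeffding one-sided lower bound, valid with probability >= 1 - alpha
    # (assumption: used here in place of a Clopper-Pearson interval).
    p_lower = count / n - math.sqrt(math.log(1.0 / alpha) / (2.0 * n))
    if p_lower <= 0.5:
        return None, 0.0  # abstain: the majority class is not certain enough
    # Certified L2 radius for Gaussian smoothing: sigma * Phi^{-1}(p_lower).
    return guess, sigma * NormalDist().inv_cdf(p_lower)


# Toy base classifier (hypothetical): class 1 if the first coordinate is positive.
toy_classifier = lambda v: int(v[0] > 0)
label, radius = certify(toy_classifier, [2.0, 0.0], sigma=0.5)
print(label, round(radius, 3))
```

In this toy setup the input lies at distance 2.0 from the decision boundary, so any valid certified radius must stay below 2.0. The Hoeffding bound keeps `p_lower` well under 1, which is why the reported radius is noticeably smaller than the true margin. A tighter confidence interval or a larger `n` would narrow that gap.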