Better Diffusion Models Further Improve Adversarial Training
Zekai Wang, Tianyu Pang, Chao Du, Min Lin, Weiwei Liu, Shuicheng Yan
Code
- github.com/wzekai99/dm-improves-at (official, in paper; PyTorch; ★ 145)
- github.com/poloclub/robust-principles (PyTorch; ★ 23)
- github.com/bbartoldson/adversarial-robustness-limits (PyTorch; ★ 17)
- github.com/BjoernNieth/LS-Dataset-pruning-in-AT (PyTorch; ★ 1)
Abstract
It has been recognized that the data generated by the denoising diffusion probabilistic model (DDPM) improves adversarial training. After two years of rapid development in diffusion models, a question naturally arises: can better diffusion models further improve adversarial training? This paper gives an affirmative answer by employing the most recent diffusion model, which has higher efficiency ($\sim 20$ sampling steps) and image quality (lower FID score) compared with DDPM. Our adversarially trained models achieve state-of-the-art performance on RobustBench using only generated data (no external datasets). Under the $\ell_\infty$-norm threat model with $\epsilon=8/255$, our models achieve 70.69% and 42.67% robust accuracy on CIFAR-10 and CIFAR-100, respectively, i.e., improving upon previous state-of-the-art models by +4.58% and +8.03%. Under the $\ell_2$-norm threat model with $\epsilon=128/255$, our models achieve 84.86% on CIFAR-10 (+4.44%). These results also beat previous works that use external data. We also provide compelling results on the SVHN and TinyImageNet datasets. Our code is available at https://github.com/wzekai99/DM-Improves-AT.
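To make the two threat models in the abstract concrete, the following is a minimal sketch of how a perturbation can be constrained to the $\ell_\infty$ ball of radius 8/255 and the $\ell_2$ ball of radius 128/255. The function names `project_linf` and `project_l2` are hypothetical and chosen for illustration; they are not taken from the authors' repository, and the sketch uses plain Python lists instead of image tensors.

```python
import math

EPS_LINF = 8 / 255    # l_inf radius quoted in the abstract
EPS_L2 = 128 / 255    # l_2 radius quoted in the abstract


def project_linf(delta, eps=EPS_LINF):
    """Clip every coordinate of the perturbation to [-eps, eps] (hypothetical helper)."""
    return [max(-eps, min(eps, d)) for d in delta]


def project_l2(delta, eps=EPS_L2):
    """Rescale the perturbation so its Euclidean norm is at most eps (hypothetical helper)."""
    norm = math.sqrt(sum(d * d for d in delta))
    if norm <= eps:
        # Already inside the ball: return an unchanged copy.
        return list(delta)
    return [d * eps / norm for d in delta]
```

An attack such as projected gradient descent would typically apply one of these projections after each gradient step, so that the adversarial example stays within the allowed distance of the clean input. Robust accuracy is then the fraction of test inputs still classified correctly under such bounded perturbations.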