
Pre-trained Multiple Latent Variable Generative Models are good defenders against Adversarial Attacks

2024-12-04

Dario Serez, Marco Cristani, Alessio Del Bue, Vittorio Murino, Pietro Morerio


Abstract

Attackers can deliberately perturb a classifier's input with subtle noise, altering the final prediction. Among the proposed countermeasures, adversarial purification employs generative networks to preprocess input images, filtering out adversarial noise. In this study, we propose a specific class of generators, termed Multiple Latent Variable Generative Models (MLVGMs), for adversarial purification. These models possess multiple latent variables that naturally disentangle coarse from fine features. Exploiting this property, we autoencode images so as to preserve class-relevant information while discarding and re-sampling fine details, including adversarial noise. The procedure is entirely training-free, probing the generalization abilities of pre-trained MLVGMs on the adversarial purification downstream task. Although we lack large models trained on billions of samples, we show that smaller MLVGMs are already competitive with traditional methods and can serve as foundation models. Official code released at https://github.com/SerezD/gen_adversarial.
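The purification idea above (keep the coarse, class-relevant latent; re-sample the fine one so adversarial noise is discarded) can be illustrated with a deliberately simplified stand-in. The sketch below is hypothetical and is not the paper's MLVGM: it replaces the generative model with a linear autoencoder whose coarse and fine latents live in orthogonal subspaces, so that "coarse content survives, fine noise is removed" can be verified exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for an MLVGM: a linear autoencoder with two orthogonal
# latent subspaces. W_coarse spans class-relevant structure, W_fine
# spans fine details (hypothetical construction, not the paper's model).
D, Zc, Zf = 16, 4, 4
Q, _ = np.linalg.qr(rng.standard_normal((D, D)))
W_coarse, W_fine = Q[:, :Zc], Q[:, Zc:Zc + Zf]

def encode(x):
    # Columns are orthonormal, so projection is a transpose-multiply.
    return W_coarse.T @ x, W_fine.T @ x

def decode(z_coarse, z_fine):
    return W_coarse @ z_coarse + W_fine @ z_fine

def purify(x_adv, fine_scale=0.0):
    """Keep the coarse (class-relevant) latent; re-sample the fine one,
    discarding fine details together with any adversarial noise."""
    z_coarse, _ = encode(x_adv)
    z_fine_new = fine_scale * rng.standard_normal(Zf)
    return decode(z_coarse, z_fine_new)

# A clean image lying in the coarse subspace, plus an adversarial
# perturbation confined to the fine subspace.
x_clean = decode(rng.standard_normal(Zc), np.zeros(Zf))
x_adv = x_clean + W_fine @ (0.5 * rng.standard_normal(Zf))

x_pur = purify(x_adv)  # fine_scale=0 for a deterministic demo
print(np.allclose(x_pur, x_clean))  # True: coarse content kept, noise gone
```

In a real MLVGM the subspaces are not linear or exactly orthogonal, so purification is approximate rather than exact; the sketch only makes the decomposition of the autoencoding step explicit.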
