Instant Adversarial Purification with Adversarial Consistency Distillation

2024-08-30CVPR 2025Unverified0· sign in to hype

Chun Tong Lei, Hon Ming Yam, Zhongliang Guo, Chun Pong Lau

Unverified — Be the first to reproduce this paper.

Abstract

Neural networks, despite their remarkable performance in widespread applications, including image classification, are also known to be vulnerable to subtle adversarial noise. Although some diffusion-based purification methods have been proposed, for example, DiffPure, those methods are time-consuming. In this paper, we propose One Step Control Purification (OSCP), a diffusion-based purification model that can purify the adversarial image in one Neural Function Evaluation (NFE) in diffusion models. We use Latent Consistency Model (LCM) and ControlNet for our one-step purification. OSCP is computationally friendly and time efficient compared to other diffusion-based purification methods; we achieve defense success rate of 74.19\% on ImageNet, only requiring 0.1s for each purification. Moreover, there is a fundamental incongruence between consistency distillation and adversarial perturbation. To address this ontological dissonance, we propose Gaussian Adversarial Noise Distillation (GAND), a novel consistency distillation framework that facilitates a more nuanced reconciliation of the latent space dynamics, effectively bridging the natural and adversarial manifolds. Our experiments show that the GAND does not need a Full Fine Tune (FFT); PEFT, e.g., LoRA is sufficient.

Tasks

Adversarial Purification image-classification Image Classification

Instant Adversarial Purification with Adversarial Consistency Distillation

Abstract

Tasks

Reproductions