CEPA: Consensus Embedded Perturbation for Agnostic Detection and Inversion of Backdoors

2024-02-03

Guangmingmei Yang, Xi Li, Hang Wang, David J. Miller, George Kesidis

Abstract

A variety of defenses have been proposed against Trojans planted in (backdoor attacks on) deep neural network (DNN) classifiers. Backdoor-agnostic methods seek to reliably detect and/or mitigate backdoors irrespective of the incorporation mechanism used by the attacker, while inversion methods explicitly assume one. In this paper, we describe a new detector that: relies on embedded feature representations to estimate (invert) the backdoor and to identify its target class; can operate without access to the training dataset; and is highly effective for various incorporation mechanisms (i.e., is backdoor agnostic). Our detection approach is evaluated against an array of published defenses, and found to compare favorably, for a variety of different attacks on the CIFAR-10 and CIFAR-100 image-classification domains.
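The abstract does not say how the detector decides which class is the backdoor target. As a purely illustrative sketch, and not the paper's procedure, the snippet below shows one generic decision rule sometimes used by inversion-style detectors: compute one statistic per candidate target class and flag a class whose statistic is an outlier under a median-absolute-deviation (MAD) test. The function name `flag_target_class`, the choice of statistic (assumed here to be a perturbation size, with unusually small values treated as suspicious), and the threshold of 3.5 are all assumptions made for this example.

```python
from statistics import median


def flag_target_class(stats, threshold=3.5):
    """Flag the class whose statistic is anomalously small (hypothetical rule).

    stats: one float per candidate target class, e.g. the size of the
           perturbation needed to push samples into that class (assumed).
    Returns (class_index or None, largest anomaly score).
    """
    med = median(stats)
    mad = median([abs(s - med) for s in stats])
    if mad == 0:
        # All statistics (nearly) identical: nothing stands out.
        return None, 0.0
    # 1.4826 makes the MAD comparable to a standard deviation for Gaussian data.
    scale = 1.4826 * mad
    # One-sided score: only values below the median count as suspicious.
    scores = [(med - s) / scale for s in stats]
    best = max(range(len(stats)), key=lambda i: scores[i])
    if scores[best] > threshold:
        return best, scores[best]
    return None, scores[best]


# Toy statistics for 10 classes; class 4 needs a much smaller perturbation.
toy_stats = [10.0, 11.0, 9.5, 10.5, 2.0, 10.2, 9.8, 10.7, 11.1, 9.9]
detected, score = flag_target_class(toy_stats)
```

With the toy values above, class 4 is the only one far below the median, so it is the one flagged; a model whose per-class statistics are all similar yields `None`.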

Tasks

Image Classification
