Learning-based Memetic Algorithm for Hard-label Textual Attack
Anonymous
Abstract
Deep neural networks in Natural Language Processing are widely known to be vulnerable to adversarial examples. However, existing textual adversarial attacks usually rely on gradients or prediction confidence to generate adversarial examples, which makes them hard to apply in real-world settings. To this end, we consider a stricter setting, namely the hard-label attack, in which the attacker can access only the predicted labels. Only a few hard-label attacks have been proposed so far, among which the one based on a genetic algorithm exhibits high attack performance. This inspires us to design a new hard-label attack with better performance based on a combinatorial optimization approach. In this work, we propose a novel hard-label attack, named Learning-based Memetic Algorithm (LMA), which integrates word importance learned from the attack history into the search of a memetic algorithm to optimize the adversarial perturbation. Extensive evaluations on text classification and textual entailment using various datasets and models demonstrate that the proposed LMA significantly outperforms existing hard-label attacks in both attack performance and the quality of the adversarial examples.
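
To make the hard-label setting and the memetic search idea concrete, the following is a minimal illustrative sketch, not the authors' implementation. The abstract does not specify the operators, so everything here is an assumption: the toy victim model `predict_label`, the `SYNONYMS` table, the function `lma_sketch`, and the rule used to update word importance are all hypothetical. The sketch only shows the general pattern: query labels alone, combine genetic operators with a local search (the "memetic" part), and bias mutation toward positions that the attack history marks as important.

```python
import random

# Hypothetical toy victim model. It exposes ONLY the predicted label,
# which is what defines the hard-label setting.
POSITIVE = {"good", "great", "fine"}
NEGATIVE = {"bad", "poor", "awful"}

def predict_label(tokens):
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return 1 if score > 0 else 0

# Hypothetical substitution candidates for each replaceable word.
SYNONYMS = {
    "good": ["decent", "fine"],
    "great": ["notable", "big"],
    "movie": ["film"],
    "plot": ["story"],
}

def lma_sketch(original, generations=20, pop_size=6, seed=0):
    """Toy memetic search for a minimally perturbed adversarial example."""
    rng = random.Random(seed)
    orig_label = predict_label(original)
    positions = [i for i, t in enumerate(original) if t in SYNONYMS]
    # Assumed importance rule: one weight per position, updated from history.
    importance = {i: 1.0 for i in positions}

    def is_adv(cand):
        return predict_label(cand) != orig_label  # label-only query

    def n_changed(cand):
        return sum(a != b for a, b in zip(cand, original))

    def mutate(cand):
        # Sample a position in proportion to its learned importance.
        weights = [importance[p] for p in positions]
        i = rng.choices(positions, weights=weights)[0]
        child = list(cand)
        child[i] = rng.choice(SYNONYMS[original[i]])
        return child

    def local_search(cand):
        # Memetic step: try to revert each changed word to the original.
        cand = list(cand)
        for i in positions:
            if cand[i] == original[i]:
                continue
            trial = list(cand)
            trial[i] = original[i]
            if is_adv(trial):
                cand = trial            # substitution was unnecessary
                importance[i] *= 0.5
            else:
                importance[i] += 1.0    # substitution was needed
        return cand

    # Initialise with heavily perturbed candidates that are already adversarial.
    population = []
    for _ in range(200):
        cand = [rng.choice(SYNONYMS[t]) if t in SYNONYMS else t for t in original]
        if is_adv(cand):
            population.append(local_search(cand))
        if len(population) == pop_size:
            break
    if not population:
        return None

    for _ in range(generations):
        a, b = rng.choice(population), rng.choice(population)
        child = mutate([rng.choice(pair) for pair in zip(a, b)])  # crossover, then mutation
        if is_adv(child):
            population.append(local_search(child))
            population = sorted(population, key=n_changed)[:pop_size]
    return min(population, key=n_changed)

original = ["a", "good", "movie", "with", "a", "great", "plot"]
adversarial = lma_sketch(original)
print(adversarial)
```

In this toy setup the search keeps only candidates whose predicted label differs from the original and prefers those with fewer substituted words, so the unnecessary substitutions ("movie", "plot") are reverted by the local search. A real attack would replace the toy model with queries to the victim classifier and would also need a semantic-similarity constraint, which this sketch omits.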