Greedy Attack and Gumbel Attack: Generating Adversarial Examples for Discrete Data

2018-05-31Unverified0· sign in to hype

Puyudi Yang, Jianbo Chen, Cho-Jui Hsieh, Jane-Ling Wang, Michael. I. Jordan

Unverified — Be the first to reproduce this paper.

Abstract

We present a probabilistic framework for studying adversarial attacks on discrete data. Based on this framework, we derive a perturbation-based method, Greedy Attack, and a scalable learning-based method, Gumbel Attack, that illustrate various tradeoffs in the design of attacks. We demonstrate the effectiveness of these methods using both quantitative metrics and human evaluation on various state-of-the-art models for text classification, including a word-based CNN, a character-based CNN and an LSTM. As as example of our results, we show that the accuracy of character-based convolutional networks drops to the level of random selection by modifying only five characters through Greedy Attack.

Tasks

General Classification text-classification Text Classification

Greedy Attack and Gumbel Attack: Generating Adversarial Examples for Discrete Data

Abstract

Tasks

Reproductions