
Adversarial Training: A Simple and Efficient Technique for Improving NLP Robustness

2021-09-29

Marwan Omar


Abstract

NLP models have been shown to be prone to adversarial attacks, which undermines their robustness: a small perturbation to the input text can fool an NLP model into misclassifying it. In this study, we present a new adversarial text generation technique that, given an input text, generates adversarial texts quickly and efficiently. For example, to attack a sentiment classification model, we can perturb the product category, an attribute that should not change the sentiment of a review. We conducted experiments on real-world NLP datasets to demonstrate that our technique generates more meaningful and diverse adversarial texts than many existing adversarial text generation approaches. We further use the generated adversarial examples to improve models through adversarial training, and we demonstrate that our generated attacks are more robust against model retraining and across different model architectures.
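The two ideas in the abstract, label-preserving attribute perturbation and adversarial training on the generated examples, can be illustrated with a toy sketch. This is not the paper's implementation: the `CATEGORY_SWAPS` table, the bag-of-words perceptron, and the function names `attack` and `adversarial_training` are all illustrative assumptions standing in for the paper's attack and models.

```python
# Toy sketch of attribute-swap adversarial text generation plus adversarial
# training. A bag-of-words perceptron plays the role of the NLP model; the
# attack replaces product-category words, which should not change sentiment.

# Hypothetical label-preserving substitutions (product categories).
CATEGORY_SWAPS = {"book": "laptop", "laptop": "book",
                  "movie": "camera", "camera": "movie"}

def features(text):
    """Bag-of-words features: the set of lowercased tokens."""
    return set(text.lower().split())

class BowClassifier:
    """Minimal perceptron over bag-of-words features (labels 0/1)."""
    def __init__(self):
        self.weights = {}

    def score(self, text):
        return sum(self.weights.get(w, 0.0) for w in features(text))

    def predict(self, text):
        return 1 if self.score(text) >= 0 else 0

    def train(self, data, epochs=10):
        for _ in range(epochs):
            for text, label in data:
                if self.predict(text) != label:
                    delta = 1.0 if label == 1 else -1.0
                    for w in features(text):
                        self.weights[w] = self.weights.get(w, 0.0) + delta

def attack(text):
    """Generate an adversarial text by swapping category words only;
    the sentiment-bearing words (and thus the label) are untouched."""
    return " ".join(CATEGORY_SWAPS.get(w, w) for w in text.lower().split())

def adversarial_training(data, epochs=10):
    """Augment the training set with attacked copies, then train."""
    augmented = data + [(attack(t), y) for t, y in data]
    clf = BowClassifier()
    clf.train(augmented, epochs)
    return clf
```

A short usage example: training on a few labeled reviews and checking that the adversarially trained model assigns the same sentiment to a review and its category-swapped counterpart.

```python
data = [("great book", 1), ("terrible book", 0),
        ("great laptop", 1), ("awful laptop", 0)]
clf = adversarial_training(data)
assert clf.predict("great laptop") == clf.predict(attack("great laptop"))
```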
