Adversarial Semantic Collisions

2020-11-09 · EMNLP 2020 · Code Available

Congzheng Song, Alexander M. Rush, Vitaly Shmatikov

Code

github.com/csong27/collision-bert
Official · In paper · PyTorch · ★ 25

Abstract

We study semantic collisions: texts that are semantically unrelated but judged as similar by NLP models. We develop gradient-based approaches for generating semantic collisions and demonstrate that state-of-the-art models for many tasks that rely on analyzing the meaning and similarity of texts (including paraphrase identification, document retrieval, response suggestion, and extractive summarization) are vulnerable to semantic collisions. For example, given a target query, inserting a crafted collision into an irrelevant document can shift its retrieval rank from 1000 to the top 3. We show how to generate semantic collisions that evade perplexity-based filtering and discuss other potential mitigations. Our code is available at https://github.com/csong27/collision-bert.
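To make the idea of a gradient-based collision search concrete, here is a minimal toy sketch. It is not the authors' method or the code in the linked repository. It assumes a hypothetical ten-word vocabulary with random embeddings and a deliberately simple similarity model (the dot product of mean token embeddings), and every name in it (`EMB`, `score`, `find_collision`) is invented for illustration. At each position it swaps in the token that a first-order (gradient) estimate says will raise the similarity score most, while excluding the query's own tokens.

```python
import random

# Hypothetical toy vocabulary with random embeddings (an assumption for
# illustration; a real model would use learned embeddings).
random.seed(0)
VOCAB = ["cat", "dog", "bank", "river", "loan",
         "tree", "stock", "rain", "coin", "leaf"]
DIM = 8
EMB = {w: [random.gauss(0.0, 1.0) for _ in range(DIM)] for w in VOCAB}


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mean_embedding(tokens):
    """Average the token embeddings into a single text vector."""
    n = len(tokens)
    return [sum(EMB[t][d] for t in tokens) / n for d in range(DIM)]


def score(query, text):
    """Toy similarity model: dot product of the two mean embeddings."""
    return dot(mean_embedding(query), mean_embedding(text))


def find_collision(query, length, banned, steps=3):
    """Gradient-guided token search for a text that scores high against query.

    Tokens in `banned` (here, the query's own words) are never used, so the
    result shares no vocabulary with the query.
    """
    allowed = [w for w in VOCAB if w not in banned]
    text = [allowed[0]] * length
    # For this linear scorer, the gradient of the score with respect to the
    # embedding at any position is mean_embedding(query) / length.
    grad = [x / length for x in mean_embedding(query)]
    for _ in range(steps):
        for i in range(length):
            current = EMB[text[i]]
            # First-order estimate of the score change from swapping in w.
            text[i] = max(
                allowed,
                key=lambda w: dot(grad, [a - b for a, b in zip(EMB[w], current)]),
            )
    return text


query = ["bank", "loan"]
collision = find_collision(query, length=4, banned=set(query))
print(collision, score(query, collision))
```

Because the toy scorer is linear, the first-order estimate is exact and the search settles on one repeated token after a single pass. With a nonlinear model such as BERT, the gradient would only approximate the effect of a swap, so candidates would need to be re-scored with the model itself.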

Tasks

Extractive Summarization, Paraphrase Identification, Retrieval
