Robustness Analysis of Grover for Machine-Generated News Detection

2021-12-01 · ALTA 2021

Rinaldo Gagiano, Maria Myung-Hee Kim, Xiuzhen Zhang, Jennifer Biggs


Abstract

Advancements in Natural Language Generation have raised concerns about its potential misuse for deep fake news. Grover is a model for both the generation and detection of neural fake news. While its performance in automatically discriminating neural fake news surpassed that of GPT-2 and BERT, Grover could face a variety of adversarial attacks designed to deceive detection. In this work, we present an investigation of Grover’s susceptibility to adversarial attacks such as character-level and word-level perturbations. The experimental results show that even a single character alteration can cause Grover to fail, affecting up to 97% of target articles when unlimited attack attempts are allowed, exposing a lack of robustness. We further analyse these misclassified cases to highlight the affected words, identify a vulnerability within Grover’s encoder, and present a novel visualisation of cumulative classification scores to assist in interpreting model behaviour.
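The abstract reports that a single character alteration can flip the detector’s prediction. Below is a minimal sketch of such a character-level perturbation attack, assuming a hypothetical `detect_machine_prob` scoring function in place of Grover’s actual discriminator; the stand-in scorer, threshold, and article text are illustrative only, not the paper’s implementation.

```python
import string

def detect_machine_prob(text: str) -> float:
    # Hypothetical detector interface: returns the probability that `text`
    # is machine-generated. Real experiments would query Grover's
    # discriminator; this toy stand-in just makes the attack loop runnable.
    return 0.9 if "neural" in text else 0.4

def single_char_attack(article: str, threshold: float = 0.5):
    """Try every single-character substitution until the detector's
    machine-generated probability drops below `threshold`."""
    if detect_machine_prob(article) < threshold:
        return article  # already classified as human-written
    for i, original in enumerate(article):
        for repl in string.ascii_lowercase:
            if repl == original:
                continue
            candidate = article[:i] + repl + article[i + 1:]
            if detect_machine_prob(candidate) < threshold:
                return candidate  # one edit flipped the prediction
    return None  # no single-character perturbation succeeded

article = "A neural network wrote this breaking news story."
adversarial = single_char_attack(article)
if adversarial is not None:
    print("Attack succeeded:", adversarial)
```

Exhaustively trying every position loosely mirrors the unlimited-attempts setting mentioned in the abstract, where the attacker may keep perturbing until detection fails.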
