SOTAVerified

Text Detoxification as Style Transfer in English and Hindi

2024-02-12Code Available0· sign in to hype

Sourabrata Mukherjee, Akanksha Bansal, Atul Kr. Ojha, John P. McCrae, Ondřej Dušek

Code Available — Be the first to reproduce this paper.

Reproduce

Code

Abstract

This paper focuses on text detoxification, i.e., automatically converting toxic text into non-toxic text. This task contributes to safer and more respectful online communication and can be considered a Text Style Transfer (TST) task, where the text style changes while its content is preserved. We present three approaches: knowledge transfer from a similar task, multi-task learning approach, combining sequence-to-sequence modeling with various toxicity classification tasks, and delete and reconstruct approach. To support our research, we utilize a dataset provided by Dementieva et al.(2021), which contains multiple versions of detoxified texts corresponding to toxic texts. In our experiments, we selected the best variants through expert human annotators, creating a dataset where each toxic sentence is paired with a single, appropriate detoxified version. Additionally, we introduced a small Hindi parallel dataset, aligning with a part of the English dataset, suitable for evaluation purposes. Our results demonstrate that our approach effectively balances text detoxication while preserving the actual content and maintaining fluency.

Tasks

Reproductions