All You Need is "Leet": Evading Hate-speech Detection AI

2025-05-22

Sampanna Yashwant Kahu, Naman Ahuja


Code

github.com/sampannakahu/all_you_need_is_leet
Official, in paper

Abstract

Social media and online forums are becoming increasingly popular. Unfortunately, these platforms are also used to spread hate speech. In this paper, we design black-box techniques to protect users from hate speech on online platforms by generating perturbations that can fool state-of-the-art, deep-learning-based hate-speech detection models, thereby decreasing their effectiveness. We also ensure that the perturbations change the original meaning of the hate speech only minimally. Our best perturbation attack successfully evades hate-speech detection for 86.8% of hateful text.
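To make the idea of a black-box "leet" perturbation concrete, here is a minimal sketch in Python. It is not the authors' implementation: the substitution table `LEET_MAP`, the functions `leet_perturb` and `evasion_rate`, and the toy keyword detector in the usage note are all hypothetical names chosen for illustration. The sketch assumes only that the detector can be queried as a function returning a flagged / not flagged decision, which is what "black-box" implies.

```python
import random

# Hypothetical character substitution table (an assumption for illustration;
# the mapping used in the paper is not given in this listing).
LEET_MAP = {"a": "4", "e": "3", "i": "1", "o": "0", "s": "5", "t": "7"}


def leet_perturb(text, rate=1.0, seed=None):
    """Replace mappable characters with leet equivalents.

    rate is the probability of substituting each mappable character;
    lower rates keep the text closer to the original.
    """
    rng = random.Random(seed)
    out = []
    for ch in text:
        sub = LEET_MAP.get(ch.lower())
        if sub is not None and rng.random() < rate:
            out.append(sub)
        else:
            out.append(ch)
    return "".join(out)


def evasion_rate(texts, is_flagged, perturb):
    """Fraction of originally flagged texts that are no longer flagged
    after perturbation. is_flagged is any black-box callable str -> bool."""
    flagged = [t for t in texts if is_flagged(t)]
    if not flagged:
        return 0.0
    evaded = sum(1 for t in flagged if not is_flagged(perturb(t)))
    return evaded / len(flagged)
```

As a usage example, `leet_perturb("test site")` yields `"7357 5173"`, and a toy detector such as `lambda t: "bad" in t.lower()` flags `"bad word"` but not its perturbed form `"b4d w0rd"`. A real evaluation would replace the toy detector with calls to an actual classifier; the 86.8% figure reported in the abstract comes from the authors' own experiments, not from this sketch.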

Tasks

Hate Speech Detection
