Visual Spoofing in content based spam detection
2020-04-11
Mark Sokolov, Kehinde Olufowobi, Nic Herndon
Abstract
Although spam classification may seem to be a solved problem, current spam filters still have vulnerabilities that can be easily exploited. We present one such vulnerability, in which some characters are replaced with corresponding characters from a different alphabet. The substituted characters are visually similar to the originals, yet have different Unicode code points. With this approach, spammers can create messages that bypass existing spam filters. Moreover, we show that this approach can be used to evade plagiarism detection, and that it affects other applications that use natural language processing for automatic analysis of text documents.
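
The character substitution described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `HOMOGLYPHS` table, the `spoof` and `scripts_used` helpers, and the sample message are all hypothetical, chosen only to show that a string can keep its appearance while its code points, and therefore any exact keyword match, change.

```python
import unicodedata

# Hypothetical mapping for illustration only: Latin letters mapped to
# visually similar Cyrillic letters with different code points.
HOMOGLYPHS = {
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "o": "\u043e",  # CYRILLIC SMALL LETTER O
    "p": "\u0440",  # CYRILLIC SMALL LETTER ER
    "c": "\u0441",  # CYRILLIC SMALL LETTER ES
}


def spoof(text):
    """Replace each mapped Latin letter with its look-alike; keep the rest."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)


def scripts_used(text):
    """Return the script names of the letters in text.

    The script is approximated by the first word of the Unicode
    character name (e.g. "LATIN", "CYRILLIC"), which is an assumption
    that holds for the characters used in this sketch.
    """
    return {unicodedata.name(ch).split()[0] for ch in text if ch.isalpha()}


original = "free prize"          # hypothetical spam-like message
spoofed = spoof(original)

print(spoofed)                   # renders like the original in most fonts
print("free" in spoofed)         # an exact keyword match no longer fires
print(scripts_used(original))
print(scripts_used(spoofed))
```

A content-based filter that tokenizes on exact code points would treat the spoofed words as unseen tokens. Checking for mixed scripts within a message, as `scripts_used` does, is one possible signal for spotting the substitution; it is shown here as an assumption, not as a defense proposed in the paper.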