Comparison of String Similarity Measures for Obscenity Filtering

2017-04-01 · WS 2017

Ekaterina Chernyak

Abstract

In this paper, we address the problem of filtering obscene lexis in Russian texts. We use string similarity measures to find words that are similar or identical to words from a stop list, and we establish both a test collection and a baseline for the task. Our experiments show that a novel string similarity measure based on the notion of an annotated suffix tree outperforms some of the other well-known measures.
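
The stop-list matching described in the abstract can be illustrated with a minimal sketch. It does not implement the annotated suffix tree measure from the paper. As a stand-in it uses normalized Levenshtein distance, one of the well-known string similarity measures. The function names, the 0.75 threshold, and the placeholder stop word are all assumptions made for illustration and do not come from the paper.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Map edit distance to [0, 1], where 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def flag_words(tokens, stop_list, threshold=0.75):
    """Return tokens whose similarity to any stop word reaches the threshold.

    The threshold is a hypothetical value, not one reported in the paper.
    """
    flagged = []
    for token in tokens:
        word = token.lower()
        if any(similarity(word, stop) >= threshold for stop in stop_list):
            flagged.append(token)
    return flagged


if __name__ == "__main__":
    # "badword" is a neutral placeholder for a stop-list entry.
    print(flag_words(["b4dword", "hello", "Badword"], ["badword"]))
```

Comparing every token against every stop word is quadratic in the worst case, which is acceptable for a short stop list. A different similarity measure can be substituted by replacing `similarity` alone.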

Tasks

Information Retrieval, Sentiment Analysis, Spelling Correction
