Using Word Embeddings for Italian Crime News Categorization
2021-10-08 · Conference on Computer Science and Intelligence Systems (FedCSIS) 2021
Federica Rollo, Giovanni Bonisoli, Laura Po
Abstract
Several studies have shown that embeddings improve outcomes in many Natural Language Processing (NLP) tasks, including text categorization. This paper focuses on how word embeddings can be applied to newspaper articles about crimes. The goal is to categorize news articles by the type of crime they report. We compare different Word2Vec models and methods for obtaining word embeddings. We then apply both supervised and unsupervised Machine Learning categorization algorithms. Experiments conducted on an Italian dataset of 15,361 crime news articles show very promising results.
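As a rough illustration of the general idea (not the authors' actual pipeline), the sketch below turns each article into a document embedding by averaging its word vectors, then assigns a crime category by cosine similarity to per-category centroids. Everything in it is an assumption made for the example: the 3-dimensional vectors in `TOY_VECTORS` are invented rather than taken from a trained Word2Vec model, the category names `theft` and `drug_dealing` are hypothetical, and nearest-centroid classification stands in for whichever supervised algorithms the paper evaluates.

```python
from math import sqrt

# Toy 3-dimensional word vectors, invented for illustration only.
# In a real setting they would come from a trained Word2Vec model.
TOY_VECTORS = {
    "furto": [0.9, 0.1, 0.0],        # theft
    "rubato": [0.8, 0.2, 0.1],       # stolen
    "ladro": [0.85, 0.15, 0.05],     # thief
    "droga": [0.1, 0.9, 0.0],        # drugs
    "spaccio": [0.2, 0.8, 0.1],      # drug dealing
    "cocaina": [0.15, 0.85, 0.05],   # cocaine
}


def document_embedding(tokens, vectors=TOY_VECTORS):
    """Average the vectors of the known tokens; None if no token is known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    return [sum(column) / len(known) for column in zip(*known)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def fit_centroids(labelled_docs):
    """Compute one mean document embedding per label from (tokens, label) pairs."""
    grouped = {}
    for tokens, label in labelled_docs:
        embedding = document_embedding(tokens)
        if embedding is not None:
            grouped.setdefault(label, []).append(embedding)
    return {
        label: [sum(column) / len(embeddings) for column in zip(*embeddings)]
        for label, embeddings in grouped.items()
    }


def predict(tokens, centroids):
    """Return the label whose centroid is most similar to the document."""
    embedding = document_embedding(tokens)
    if embedding is None:
        return None
    return max(centroids, key=lambda label: cosine(embedding, centroids[label]))


# Hypothetical, hand-written training examples (already tokenized).
TRAIN = [
    (["ladro", "rubato"], "theft"),
    (["furto", "ladro"], "theft"),
    (["droga", "spaccio"], "drug_dealing"),
    (["cocaina", "spaccio"], "drug_dealing"),
]
CENTROIDS = fit_centroids(TRAIN)
```

Calling `predict(["furto", "rubato"], CENTROIDS)` picks the category whose centroid lies closest to the averaged vector of the article's tokens. Averaging is only one way to build a document embedding from word embeddings; it is used here because it is the simplest to show in a few lines.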