Fusing Document, Collection and Label Graph-based Representations with Word Embeddings for Text Classification

2018-06-01 · WS 2018 · Code Available

Konstantinos Skianis, Fragkiskos Malliaros, Michalis Vazirgiannis


Abstract

Contrary to the traditional Bag-of-Words approach, we consider the Graph-of-Words (GoW) model, in which each document is represented by a graph that encodes relationships between its terms. Based on this formulation, the importance of a term is determined by weighting the corresponding node in the document, collection and label graphs, using node centrality criteria. We also introduce novel graph-based weighting schemes that enrich the graphs with word-embedding similarities, in order to reward or penalize semantic relationships. Our methods produce more discriminative feature weights for text categorization, outperforming existing frequency-based criteria.
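To make the document-level idea concrete, the following is a minimal sketch of a Graph-of-Words with centrality-based term weights. It is not the authors' implementation: the sliding-window size, the undirected edges, and the choice of degree centrality are all assumptions made for illustration, and the function names `graph_of_words` and `degree_centrality` are hypothetical. The collection graph, the label graph, and the embedding-based edge weights described in the abstract are not covered here.

```python
from collections import defaultdict


def graph_of_words(tokens, window=3):
    """Build an undirected co-occurrence graph (assumed construction).

    Two distinct terms are linked if they appear within `window`
    consecutive tokens of each other.
    """
    adj = defaultdict(set)
    for i, term in enumerate(tokens):
        adj[term]  # touch the key so isolated terms still become nodes
        for other in tokens[i + 1 : i + window]:
            if other != term:
                adj[term].add(other)
                adj[other].add(term)
    return adj


def degree_centrality(adj):
    """Weight each term by its degree, normalized by (number of nodes - 1)."""
    n = len(adj)
    if n <= 1:
        return {term: 0.0 for term in adj}
    return {term: len(neighbors) / (n - 1) for term, neighbors in adj.items()}


# Toy document: "graph" occurs twice, every other term once.
tokens = "graph models encode relationships between terms in a graph".split()
weights = degree_centrality(graph_of_words(tokens, window=3))

for term, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{term:15s} {weight:.3f}")
```

In this sketch a term's weight depends on how many distinct terms it co-occurs with rather than on how often it appears, which is the contrast with frequency-based Bag-of-Words weighting that the abstract draws.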
