Bag-of-Words vs. Graph vs. Sequence in Text Classification: Questioning the Necessity of Text-Graphs and the Surprising Strength of a Wide MLP
Lukas Galke, Ansgar Scherp
Code
- github.com/lgalke/text-clf-baselines (official, PyTorch)
- github.com/sahanaramnath/bow-vs-graph-vs-seq-textclassification (PyTorch)
Abstract
Graph neural networks have triggered a resurgence of graph-based text classification methods, defining today's state of the art. We show that a wide multi-layer perceptron (MLP) using a Bag-of-Words (BoW) outperforms the recent graph-based models TextGCN and HeteGCN in an inductive text classification setting and is comparable with HyperGAT. Moreover, we fine-tune a sequence-based BERT and a lightweight DistilBERT model, which both outperform all state-of-the-art models. These results question the importance of synthetic graphs used in modern text classifiers. In terms of efficiency, DistilBERT is still twice as large as our BoW-based wide MLP, while graph-based models like TextGCN require setting up an O(N^2) graph, where N is the vocabulary plus corpus size. Finally, since Transformers need to compute O(L^2) attention weights with sequence length L, the MLP models show higher training and inference speeds on datasets with long sequences.
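To make the BoW-plus-wide-MLP baseline concrete, here is a minimal, self-contained sketch: a bag-of-words vectorizer and a single-hidden-layer MLP trained with plain gradient descent on a toy corpus. This is an illustrative NumPy reimplementation, not the authors' code; the hidden width, learning rate, and toy documents are all assumptions for demonstration (the paper's actual models are in the linked PyTorch repositories).

```python
import numpy as np

def bow_vectorize(docs, vocab):
    """Count-based bag-of-words: one row per document, one column per word."""
    index = {w: i for i, w in enumerate(vocab)}
    X = np.zeros((len(docs), len(vocab)))
    for d, doc in enumerate(docs):
        for w in doc.lower().split():
            if w in index:
                X[d, index[w]] += 1.0
    return X

class WideMLP:
    """One hidden ReLU layer ("wide MLP"), softmax output, cross-entropy loss."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.1, (n_hidden, n_out))
        self.b2 = np.zeros(n_out)

    def forward(self, X):
        self.h = np.maximum(0.0, X @ self.W1 + self.b1)   # ReLU hidden layer
        z = self.h @ self.W2 + self.b2
        e = np.exp(z - z.max(axis=1, keepdims=True))       # stable softmax
        return e / e.sum(axis=1, keepdims=True)

    def fit(self, X, y, epochs=200, lr=0.5):
        Y = np.eye(self.W2.shape[1])[y]                    # one-hot labels
        for _ in range(epochs):
            P = self.forward(X)
            dz = (P - Y) / len(X)                          # softmax + CE gradient
            dW2 = self.h.T @ dz
            dh = (dz @ self.W2.T) * (self.h > 0)           # backprop through ReLU
            dW1 = X.T @ dh
            self.W2 -= lr * dW2; self.b2 -= lr * dz.sum(axis=0)
            self.W1 -= lr * dW1; self.b1 -= lr * dh.sum(axis=0)

    def predict(self, X):
        return self.forward(X).argmax(axis=1)

# Toy two-class sentiment corpus (hypothetical data for illustration only).
docs = ["good great fun", "great movie fun", "bad awful boring", "awful dull bad"]
labels = np.array([1, 1, 0, 0])
vocab = sorted({w for d in docs for w in d.split()})

X = bow_vectorize(docs, vocab)
clf = WideMLP(n_in=len(vocab), n_hidden=64, n_out=2)
clf.fit(X, labels)
print(clf.predict(X))
```

Note what the baseline does *not* need: no word-document co-occurrence graph (TextGCN's O(N^2) setup) and no O(L^2) attention over token sequences; each document is a single fixed-size vector, so a forward pass is two matrix multiplications regardless of document length.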