Transformers are Short Text Classifiers: A Study of Inductive Short Text Classifiers on Benchmarks and Real-world Datasets

2022-11-30 · Code Available

Fabian Karl, Ansgar Scherp


Abstract

Short text classification is a crucial and challenging aspect of Natural Language Processing. For this reason, there are numerous highly specialized short text classifiers. However, recent short text research has not exploited State of the Art (SOTA) methods for traditional text classification, particularly the pure use of Transformers. In this work, we examine the performance of a variety of short text classifiers as well as the top-performing traditional text classifier. We further investigate the effects on two new real-world short text datasets to address the problem of over-reliance on benchmark datasets with a limited number of characteristics. Our experiments unambiguously demonstrate that Transformers achieve SOTA accuracy on short text classification tasks, raising the question of whether specialized short text techniques are necessary.

Benchmark Results

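| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| MR | RoBERTa | Accuracy | 89.42 | | Unverified |
| MR | DistilBERT | Accuracy | 85.31 | | Unverified |
| MR | ALBERTv2 | Accuracy | 86.02 | | Unverified |
| MR | BERT | Accuracy | 86.94 | | Unverified |
| MR | ERNIE 2.0 | Accuracy | 88.97 | | Unverified |
| MR | DeBERTa | Accuracy | 90.21 | | Unverified |
| MR | ERNIE 2.0 (optimized) | Accuracy | 89.53 | | Unverified |
| NICE-2 | RoBERTa | Accuracy | 99.76 | | Unverified |
| NICE-45 | BERT | Accuracy | 72.79 | | Unverified |
| R8 | fastText | Accuracy | 96.13 | | Unverified |
| R8 | DeBERTa | Accuracy | 98.45 | | Unverified |
| R8 | C-BERT (ESGNN + BERT) | Accuracy | 98.28 | | Unverified |
| R8 | ESGNN | Accuracy | 98.23 | | Unverified |
| R8 | BERT | Accuracy | 98.17 | | Unverified |
| R8 | SGNN | Accuracy | 98.09 | | Unverified |
| R8 | ERNIE 2.0 | Accuracy | 98.04 | | Unverified |
| R8 | DistilBERT | Accuracy | 97.98 | | Unverified |
| R8 | ALBERTv2 | Accuracy | 97.62 | | Unverified |
| R8 | WideMLP | Accuracy | 96.98 | | Unverified |
| Searchsnippets | BERT | Accuracy | 88.2 | | Unverified |
| Searchsnippets | DistilBERT | Accuracy | 89.69 | | Unverified |
| SST-2 | BERT | Accuracy | 91.37 | | Unverified |
| SST-2 | DeBERTa | Accuracy | 94.78 | | Unverified |
| STOPS-2 | ERNIE 2.0 | Accuracy | 99.88 | | Unverified |
| STOPS-41 | DeBERTa | Accuracy | 89.73 | | Unverified |
| TREC-10 | BERT | Accuracy | 99.4 | | Unverified |
| Twitter | ERNIE 2.0 | Accuracy | 99.97 | | Unverified |
| Twitter | BERT | Accuracy | 99.96 | | Unverified |
| Twitter | DistilBERT | Accuracy | 99.96 | | Unverified |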
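
The Accuracy values in the table above appear to be percentages of correctly classified test examples. As a minimal sketch of how such a figure could be computed when checking a claimed result, the snippet below compares predicted labels against gold labels. The function name `accuracy_percent` and the example labels are hypothetical and are not taken from the paper or its code.

```python
from typing import Sequence


def accuracy_percent(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Return the percentage of predictions that match the gold labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on an empty test set")
    # Count positions where the predicted label equals the gold label.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


# Hypothetical gold labels and predictions for five short texts.
gold = ["pos", "neg", "pos", "neg", "pos"]
pred = ["pos", "neg", "neg", "neg", "pos"]
print(f"{accuracy_percent(gold, pred):.2f}")  # 4 of 5 correct -> 80.00
```

A reproduction would run the released model on the dataset's test split and compare the resulting percentage with the Claimed column. Small differences can arise from random seeds and preprocessing choices.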
