Bag of Tricks for Efficient Text Classification
2016-07-06 · EACL 2017
Armand Joulin, Edouard Grave, Piotr Bojanowski, Tomas Mikolov
Code
- github.com/facebookresearch/fastText · Official · In paper · none · ★ 0
- github.com/graykode/nlp-tutorial · pytorch · ★ 14,880
- github.com/TsingZ0/PFL-Non-IID · pytorch · ★ 2,091
- github.com/currentsapi/extractnet · none · ★ 300
- github.com/csebuetnlp/xl-sum · jax · ★ 277
- github.com/gmichalo/question_identification_on_medical_logs · pytorch · ★ 2
- github.com/brightmart/text_classification · tf · ★ 0
- github.com/mindspore-ai/models/tree/master/official/nlp/fasttext · mindspore · ★ 0
- github.com/M155K4R4/fastText · none · ★ 0
- github.com/2023-MindSpore-1/ms-code-200 · mindspore · ★ 0
Abstract
This paper explores a simple and efficient baseline for text classification. Our experiments show that our fast text classifier fastText is often on par with deep learning classifiers in terms of accuracy, and many orders of magnitude faster for training and evaluation. We can train fastText on more than one billion words in less than ten minutes using a standard multicore CPU, and classify half a million sentences among 312K classes in less than a minute.
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| Amazon Review Full | fastText | Accuracy | 60.2 | — | Unverified |
| Amazon Review Polarity | fastText | Accuracy | 94.6 | — | Unverified |
| Sogou News | fastText, h=10, bigram | Accuracy | 96.8 | — | Unverified |
| Yelp Binary classification | fastText, h=10, bigram | Error | 4.3 | — | Unverified |
| Yelp Fine-grained classification | fastText | Error | 36.1 | — | Unverified |
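
The `h=10, bigram` rows in the table refer to a fastText model with 10 hidden units that uses word bigrams in addition to single words. The sketch below is a minimal, untrained illustration of that kind of model: hashed unigram and bigram features are averaged into a 10-dimensional vector, which feeds a linear layer and a softmax. It is not the official implementation. The bucket count, the class count, the use of `zlib.crc32` as the hash, and the random weights are all assumptions made for illustration.

```python
import math
import random
import zlib

H = 10              # hidden size, matching the "h=10" rows
NUM_BUCKETS = 1000  # hypothetical number of hash buckets
NUM_CLASSES = 2     # hypothetical, e.g. a binary sentiment task


def features(text):
    """Return hashed indices for the unigrams and word bigrams of `text`."""
    tokens = text.lower().split()
    ngrams = tokens + [a + " " + b for a, b in zip(tokens, tokens[1:])]
    # crc32 is an illustrative stand-in, not claimed to be fastText's hash.
    return [zlib.crc32(g.encode("utf-8")) % NUM_BUCKETS for g in ngrams]


# Random, untrained parameters: one embedding per bucket, one weight row per class.
rng = random.Random(0)
embeddings = [[rng.uniform(-0.1, 0.1) for _ in range(H)] for _ in range(NUM_BUCKETS)]
weights = [[rng.uniform(-0.1, 0.1) for _ in range(H)] for _ in range(NUM_CLASSES)]


def predict_proba(text):
    """Average the feature embeddings, apply a linear layer, then a softmax."""
    idx = features(text)
    if idx:
        hidden = [sum(embeddings[i][d] for i in idx) / len(idx) for d in range(H)]
    else:
        hidden = [0.0] * H  # empty input: fall back to a zero vector
    scores = [sum(w[d] * hidden[d] for d in range(H)) for w in weights]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # shift by max for stability
    total = sum(exps)
    return [e / total for e in exps]


print(predict_proba("the food was great"))
```

Because the parameters are random, the printed probabilities carry no meaning. The sketch only shows the data flow from text to class probabilities, and training would fit `embeddings` and `weights` to labeled examples.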