How to Fine-Tune BERT for Text Classification?
2019-05-14
Chi Sun, Xipeng Qiu, Yige Xu, Xuanjing Huang
Code
- github.com/xuyige/BERT4doc-Classification (official, in paper, PyTorch, ★ 0)
- github.com/GeorgeLuImmortal/Hierarchical-BERT-Model-with-Limited-Labelled-Data (PyTorch, ★ 42)
- github.com/heraclex12/VLSP2020-Fake-News-Detection (PyTorch, ★ 18)
- github.com/bcaitech1/p4-dkt-no_caffeine_no_gain (PyTorch, ★ 16)
- github.com/Derposoft/ai-educator (★ 2)
- github.com/helmy-elrais/RoBERT_Recurrence_over_BERT (PyTorch, ★ 0)
- github.com/soarsmu/BiasFinder (PyTorch, ★ 0)
- github.com/arctic-yen/Google_QUEST_Q-A_Labeling (TensorFlow, ★ 0)
- github.com/sahil00199/KYC (PyTorch, ★ 0)
- github.com/Domminique/Deploy-BERT-for-Sentiment-Analysis-with-FastAPI- (PyTorch, ★ 0)
Abstract
Language model pre-training has proven useful for learning universal language representations. As a state-of-the-art pre-trained language model, BERT (Bidirectional Encoder Representations from Transformers) has achieved remarkable results on many language understanding tasks. In this paper, we conduct exhaustive experiments to investigate different fine-tuning methods of BERT on the text classification task and provide a general solution for BERT fine-tuning. The proposed solution obtains new state-of-the-art results on eight widely studied text classification datasets.
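One of the fine-tuning strategies the paper investigates is a layer-wise decreasing learning rate: the top transformer layer receives the base learning rate, and each lower layer's rate is scaled down by a decay factor. The sketch below illustrates that schedule in plain Python; the specific values (base rate 2e-5, decay 0.95, 12 layers) are illustrative defaults, not results quoted from this page.

```python
def layerwise_lrs(base_lr, decay, num_layers):
    """Per-layer learning rates for layer-wise decay fine-tuning.

    Index 0 is the bottom (earliest) layer; the top layer gets the
    full base rate, and each layer below it is scaled by `decay`:
        lr_l = base_lr * decay ** (top - l)
    """
    top = num_layers - 1
    return [base_lr * decay ** (top - layer) for layer in range(num_layers)]

# Illustrative values: base rate and decay factor are assumptions,
# chosen in the spirit of the paper's discriminative fine-tuning setup.
lrs = layerwise_lrs(base_lr=2e-5, decay=0.95, num_layers=12)
```

In an actual training setup, each entry of `lrs` would be attached to the parameter group of the corresponding encoder layer, so that lower layers (which hold more general features) change more slowly than task-specific upper layers.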
Benchmark Results
| Dataset | Model | Metric | Claimed (%) | Verified | Status |
|---|---|---|---|---|---|
| IMDb | BERT_large+ITPT | Accuracy | 95.79 | — | Unverified |
| IMDb | BERT_base+ITPT | Accuracy | 95.63 | — | Unverified |
| Yelp binary classification | BERT_large+ITPT | Error rate | 1.81 | — | Unverified |
| Yelp binary classification | BERT_base+ITPT | Error rate | 1.92 | — | Unverified |
| Yelp fine-grained classification | BERT_large+ITPT | Error rate | 28.62 | — | Unverified |
| Yelp fine-grained classification | BERT_base+ITPT | Error rate | 29.42 | — | Unverified |
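The "ITPT" suffix in the table denotes within-task further pre-training: before fine-tuning for classification, BERT is further pre-trained with its masked-language-modelling objective on the target-task corpus. The sketch below shows the core data-preparation step in plain Python. The 15% masking rate follows the standard BERT recipe; the `mask_tokens` helper is a simplification written for this page (it always substitutes `[MASK]`, omitting BERT's 80/10/10 mask/random/keep split).

```python
import random

def mask_tokens(tokens, mask_prob=0.15, rng=None):
    """Randomly mask tokens for a masked-language-modelling step.

    Returns the masked token sequence and a parallel label list:
    the original token at masked positions, None elsewhere
    (no loss is computed on unmasked positions).
    """
    rng = rng or random.Random(0)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append("[MASK]")
            labels.append(tok)      # model must recover this token
        else:
            masked.append(tok)
            labels.append(None)     # position excluded from the loss
    return masked, labels

tokens = "the movie was surprisingly good".split()
masked, labels = mask_tokens(tokens)
```

Running this masking over the task's own unlabelled text, training the MLM head on the result, and only then fine-tuning the classifier is the two-stage recipe that the ITPT rows above evaluate.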