Text Summarization with Pretrained Encoders
Yang Liu, Mirella Lapata
Code
| Repository | Framework | Stars | Notes |
|---|---|---|---|
| github.com/nlpyang/PreSumm | PyTorch | 1,302 | Official, in paper |
| github.com/HHousen/TransformerSum | PyTorch | 439 | |
| github.com/nakhunchumpolsathien/TR-TPBS | none | 29 | |
| github.com/alebryvas/berk266 | PyTorch | 20 | |
| github.com/nachotp/BertCommentSum | PyTorch | 13 | |
| github.com/olivia-fsm/p2mcq | PyTorch | 10 | |
| github.com/chesterdu/contrastive_summary | PyTorch | 3 | |
| github.com/manshri/tesum | PyTorch | 1 | |
| github.com/raqoon886/KoBertSum | PyTorch | 0 | |
| github.com/raqoon886/KorBertSum | PyTorch | 0 | |
Abstract
Bidirectional Encoder Representations from Transformers (BERT) represents the latest incarnation of pretrained language models which have recently advanced a wide range of natural language processing tasks. In this paper, we showcase how BERT can be usefully applied in text summarization and propose a general framework for both extractive and abstractive models. We introduce a novel document-level encoder based on BERT which is able to express the semantics of a document and obtain representations for its sentences. Our extractive model is built on top of this encoder by stacking several inter-sentence Transformer layers. For abstractive summarization, we propose a new fine-tuning schedule which adopts different optimizers for the encoder and the decoder as a means of alleviating the mismatch between the two (the former is pretrained while the latter is not). We also demonstrate that a two-staged fine-tuning approach can further boost the quality of the generated summaries. Experiments on three datasets show that our model achieves state-of-the-art results across the board in both extractive and abstractive settings. Our code is available at https://github.com/nlpyang/PreSumm
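The fine-tuning schedule described above gives the pretrained encoder and the randomly initialized decoder separate Adam optimizers, each with its own peak learning rate and warmup so the decoder can train aggressively while the encoder changes slowly. A minimal sketch of such a warmup-then-decay schedule (the constants below are illustrative assumptions, not verified against the released code):

```python
def noam_lr(step: int, base_lr: float, warmup: int) -> float:
    """Warmup-then-decay learning rate:
    lr = base_lr * min(step**-0.5, step * warmup**-1.5).
    Rises linearly for `warmup` steps, then decays as 1/sqrt(step)."""
    return base_lr * min(step ** -0.5, step * warmup ** -1.5)

# Illustrative settings in the spirit of the paper: the pretrained
# encoder gets a small peak LR with a long warmup; the untrained
# decoder gets a larger peak LR with a shorter warmup.
ENC_LR, ENC_WARMUP = 2e-3, 20000
DEC_LR, DEC_WARMUP = 0.1, 10000

for step in (1000, 10000, 20000, 100000):
    print(f"step {step:6d}  encoder lr {noam_lr(step, ENC_LR, ENC_WARMUP):.2e}"
          f"  decoder lr {noam_lr(step, DEC_LR, DEC_WARMUP):.2e}")
```

Early in training the decoder's learning rate is orders of magnitude larger than the encoder's, which is the intended effect: the decoder must learn from scratch while the encoder only needs gentle adaptation.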
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| CNN / Daily Mail | BertSumExtAbs | ROUGE-1 | 42.13 | — | Unverified |