On Extractive and Abstractive Neural Document Summarization with Transformer Language Models
Sandeep Subramanian, Raymond Li, Jonathan Pilault, Christopher Pal
Abstract
We present a method to produce abstractive summaries of long documents that exceed several thousand words via neural abstractive summarization. We perform a simple extractive step before generating a summary, which is then used to condition the transformer language model on relevant information before being tasked with generating a summary. We show that this extractive step significantly improves summarization results. We also show that this approach produces more abstractive summaries compared to prior work that employs a copy mechanism while still achieving higher rouge scores. Note: The abstract above was not written by the authors; it was generated by one of the models presented in this paper.
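The two-stage idea in the abstract (extract a few relevant sentences, then condition a language model on them) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the word-frequency scorer stands in for the paper's learned extractors, and the function names and the `Summary:` delimiter are hypothetical rather than taken from the authors' code.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_sentences(document, k=2):
    """Pick the k sentences with the highest average word frequency.

    Assumption: this frequency heuristic is a placeholder for a trained
    extractive model; it is not the method used in the paper.
    """
    sentences = split_sentences(document)
    freq = Counter(re.findall(r"[a-z0-9]+", document.lower()))

    def score(sentence):
        tokens = re.findall(r"[a-z0-9]+", sentence.lower())
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    # Return the selected sentences in their original document order.
    return [sentences[i] for i in sorted(ranked[:k])]


def build_conditioning_text(document, k=2):
    # Hypothetical prompt layout: extracted sentences, then a delimiter
    # after which a language model would be asked to write the summary.
    return " ".join(extract_sentences(document, k)) + "\nSummary:"
```

The string returned by `build_conditioning_text` would be passed as the prefix to whatever language model performs the abstractive step; that model is outside the scope of this sketch.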
Benchmark Results
| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| Arxiv HEP-TH citation graph | TLM-I+E | ROUGE-1 | 42.43 | — | Unverified |
| Arxiv HEP-TH citation graph | Sent-PTR | ROUGE-1 | 42.32 | — | Unverified |
| Arxiv HEP-TH citation graph | Sent-CLF | ROUGE-1 | 34.01 | — | Unverified |
| Pubmed | Sent-CLF | ROUGE-1 | 45.01 | — | Unverified |
| Pubmed | Sent-PTR | ROUGE-1 | 43.3 | — | Unverified |
| Pubmed | TLM-I+E | ROUGE-1 | 41.43 | — | Unverified |
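For readers unfamiliar with the metric in the table, ROUGE-1 measures unigram overlap between a generated summary and a reference summary. The following is a simplified sketch only: it lowercases and splits on alphanumeric runs, with no stemming or stopword handling, so its numbers will not exactly match those of the official ROUGE toolkit or the values listed above. The function name `rouge_1` is chosen for illustration.

```python
import re
from collections import Counter


def rouge_1(candidate, reference):
    """Simplified ROUGE-1: clipped unigram overlap, no stemming."""
    cand = Counter(re.findall(r"[a-z0-9]+", candidate.lower()))
    ref = Counter(re.findall(r"[a-z0-9]+", reference.lower()))
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    if overlap == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}
```

For example, scoring the candidate "the cat sat" against the reference "the cat sat on the mat" gives a precision of 1.0 and a recall of 0.5, since all three candidate tokens appear among the six reference tokens.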