Generating Wikipedia by Summarizing Long Sequences
Peter J. Liu, Mohammad Saleh, Etienne Pot, Ben Goodrich, Ryan Sepassi, Lukasz Kaiser, Noam Shazeer
Code
- github.com/tensorflow/tensor2tensor (official, in paper; TensorFlow; ★ 17,096)
- github.com/aseidelo/wiki_generator (TensorFlow; ★ 5)
- github.com/brsarah20/Alphafold2 (PyTorch; ★ 2)
- github.com/lucidrains/memory-compressed-attention (PyTorch; ★ 0)
Abstract
We show that generating English Wikipedia articles can be approached as a multi-document summarization of source documents. We use extractive summarization to coarsely identify salient information and a neural abstractive model to generate the article. For the abstractive model, we introduce a decoder-only architecture that can scalably attend to very long sequences, much longer than typical encoder-decoder architectures used in sequence transduction. We show that this model can generate fluent, coherent multi-sentence paragraphs and even whole Wikipedia articles. When given reference documents, we show it can extract relevant factual information as reflected in perplexity, ROUGE scores and human evaluations.
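The two-stage pipeline described in the abstract can be illustrated with a minimal sketch of the extractive stage. The abstract does not specify the extractive method, so this sketch assumes a simple TF-IDF ranking of source paragraphs against the article title, followed by truncation to a fixed token budget. The function names (`tokenize`, `rank_paragraphs`, `build_model_input`), the `<sep>` delimiter, and the whitespace tokenizer are all hypothetical and are not taken from the paper or from tensor2tensor.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercased whitespace split; a stand-in for a real subword tokenizer.
    return text.lower().split()


def rank_paragraphs(query, paragraphs):
    """Order paragraphs by a TF-IDF score against the query, highest first."""
    docs = [Counter(tokenize(p)) for p in paragraphs]
    n_docs = len(docs)
    query_words = set(tokenize(query))
    scores = []
    for doc in docs:
        score = 0.0
        for word in query_words:
            # Document frequency: number of paragraphs containing the word.
            df = sum(1 for d in docs if word in d)
            if df == 0 or word not in doc:
                continue
            score += doc[word] * math.log(n_docs / df)
        scores.append(score)
    # sorted() is stable, so tied paragraphs keep their original order.
    order = sorted(range(n_docs), key=lambda i: -scores[i])
    return [paragraphs[i] for i in order]


def build_model_input(query, paragraphs, max_tokens, sep="<sep>"):
    """Concatenate ranked paragraphs and keep only the first max_tokens tokens.

    Assumption: the abstractive model consumes one flat token sequence,
    here the query, a delimiter, then the truncated extractive output.
    """
    ranked = rank_paragraphs(query, paragraphs)
    tokens = [t for p in ranked for t in tokenize(p)][:max_tokens]
    return tokenize(query) + [sep] + tokens
```

In this sketch the extractive stage only decides which source text fits inside the budget; generating the article itself is left to the abstractive model, which is not shown here.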