LipKey: A Large-Scale News Dataset for Absent Keyphrases Generation and Abstractive Summarization
2022-10-01 · COLING 2022
Fajri Koto, Timothy Baldwin, Jey Han Lau
Abstract
Summaries, keyphrases, and titles are different ways of concisely capturing the content of a document. While most previous work has released keyphrase and summarization datasets separately, in this work we introduce LipKey, the largest news corpus with human-written abstractive summaries, absent keyphrases, and titles. We use the three elements jointly for document summarization, both via multi-task training and by supplying them as joint structured inputs. We find that including absent keyphrases and titles as additional context to the source document improves transformer-based summarization models.
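To make the idea of "joint structured inputs" concrete, here is a minimal sketch of one way a title and keyphrases could be prepended to a source document before it is passed to a summarization model. The abstract does not specify the input format, so the function name `build_structured_input`, the `[TITLE]`, `[KEYPHRASES]`, and `[DOCUMENT]` markers, and the ` ; ` separator are all hypothetical choices for illustration and should not be read as the authors' implementation.

```python
from typing import Sequence


def build_structured_input(
    document: str,
    title: str = "",
    keyphrases: Sequence[str] = (),
    sep: str = " ; ",
) -> str:
    """Concatenate title, keyphrases, and document into one input string.

    The field markers below are hypothetical; the abstract does not say
    how the three elements are actually delimited.
    """
    parts = []
    if title.strip():
        parts.append(f"[TITLE] {title.strip()}")
    # Drop empty keyphrases and surrounding whitespace before joining.
    cleaned = [k.strip() for k in keyphrases if k.strip()]
    if cleaned:
        parts.append("[KEYPHRASES] " + sep.join(cleaned))
    parts.append(f"[DOCUMENT] {document.strip()}")
    return " ".join(parts)


if __name__ == "__main__":
    print(
        build_structured_input(
            "Heavy rain flooded several districts overnight.",
            title="Floods hit the city",
            keyphrases=["natural disaster", "emergency response"],
        )
    )
```

The resulting string would then be tokenized and fed to a sequence-to-sequence summarizer in place of the bare document. Fields that are missing are simply omitted, so the same function also covers the document-only baseline.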