BillSum: A Corpus for Automatic Summarization of US Legislation

2019-10-01 · WS 2019 · Code Available

Anastassia Kornilova, Vlad Eidelman

Code

github.com/FiscalNote/BillSum (official, in paper, tf)
github.com/allisontam/bills867

Abstract

Automatic summarization methods have been studied in a variety of domains, including news and scientific articles. Yet legislation has not previously been considered for this task, despite the US Congress and state governments releasing tens of thousands of bills every year. In this paper, we introduce BillSum, the first dataset for summarization of US Congressional and California state bills (https://github.com/FiscalNote/BillSum). We explain the properties of the dataset that make it more challenging to process than other domains. Then, we benchmark extractive methods that consider neural sentence representations and traditional contextual features. Finally, we demonstrate that models built on Congressional bills can be used to summarize California bills, thus showing that methods developed on this dataset can transfer to states without human-written summaries.

Tasks

Articles, Sentence, Text Summarization

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
BillSum	Longformer Encoder Decoder	rouge1	38.65	—	Unverified
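
The table reports a rouge1 score, which measures unigram overlap between a generated summary and a reference summary. The listing does not say which ROUGE variant or preprocessing produced the claimed figure, so the following is only a minimal sketch of the general idea. The function name `rouge1`, the lowercase whitespace tokenization, and the example sentences are assumptions made for illustration; standard ROUGE tooling typically adds stemming and other normalization, so scores from this sketch are not directly comparable to the number above.

```python
from collections import Counter


def rouge1(candidate: str, reference: str) -> dict:
    """Unigram-overlap precision, recall and F1 (simplified ROUGE-1).

    Assumption: tokens are lowercase, whitespace-separated words.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each word counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand_counts & ref_counts).values())

    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    if overlap == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)

    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical sentences, not taken from the BillSum data.
    scores = rouge1(
        "the bill amends the tax code",
        "the bill amends the internal revenue code",
    )
    print(scores)
```

Benchmark tables often report such scores multiplied by 100, so a raw value of 0.3865 would be displayed as 38.65; whether that convention applies to the claimed figure here is not stated in the listing.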
