Extractive Multi-document Summarization using K-means, Centroid-based Method, MMR, and Sentence Position

2019-12-04 · The Tenth International Symposium 2019

Hai Cao Manh, Huong Le Thanh, Tuan Luu Minh

Abstract

Multi-document summarization is more challenging than single-document summarization because it must handle information that overlaps among sentences from different documents. In addition, because multi-document summarization datasets are rare, methods based on deep learning are difficult to apply. In this paper, we propose an approach to multi-document summarization based on the k-means clustering algorithm, combined with the centroid-based method, maximal marginal relevance (MMR), and sentence positions. This approach is efficient at finding salient sentences and preventing overlap between the selected sentences. Experiments on the DUC 2007 dataset show that our system is more efficient than the systems of other researchers in this field.
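
As an illustration only, the way centroid relevance, sentence position, and MMR can interact is sketched below. This is not the authors' implementation: the abstract does not specify how the components are combined, so the bag-of-words vectors, the position bonus `pos_weight / (1 + index)`, the trade-off `lam`, and all function names are assumptions made for this example, and the k-means clustering step is left out to keep the sketch short.

```python
import math
from collections import Counter


def bow(sentence):
    """Lower-cased bag-of-words vector as a Counter (assumed representation)."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def summarize(docs, k=2, lam=0.5, pos_weight=0.2):
    """Pick up to k sentences from docs (a list of lists of sentences).

    Relevance: cosine similarity to the collection centroid plus a
    hypothetical position bonus that favors early sentences.
    Selection: greedy MMR, trading relevance against the maximum
    similarity to any sentence already chosen.
    """
    # Flatten, remembering each sentence's index within its document.
    sents = [(s, i) for doc in docs for i, s in enumerate(doc)]
    vecs = [bow(s) for s, _ in sents]

    # Centroid: summed term counts over every sentence in the collection.
    centroid = Counter()
    for v in vecs:
        centroid.update(v)

    rel = [cosine(v, centroid) + pos_weight / (1 + i)
           for v, (_, i) in zip(vecs, sents)]

    chosen = []
    while len(chosen) < min(k, len(sents)):
        best, best_score = None, -math.inf
        for j in range(len(sents)):
            if j in chosen:
                continue
            # Redundancy is 0 while nothing has been chosen yet.
            redundancy = max((cosine(vecs[j], vecs[c]) for c in chosen),
                             default=0.0)
            score = lam * rel[j] - (1 - lam) * redundancy
            if score > best_score:
                best, best_score = j, score
        chosen.append(best)
    return [sents[j][0] for j in chosen]


docs = [
    ["the storm hit the coast", "rescue teams arrived quickly"],
    ["the storm hit the coast", "damage costs are rising"],
]
summary = summarize(docs, k=2)
print(summary)
```

In this toy input the same lead sentence appears in both documents. The MMR penalty keeps the second copy out of the summary even though it has the highest relevance score, which is the overlap problem the abstract describes.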
