ComSum: Commit Messages Summarization and Meaning Preservation

2021-08-23 · Code Available

Leshem Choshen, Idan Amit

Abstract

We present ComSum, a data set of 7 million commit messages for text summarization. When developers document commits (changes to software code), they post both a message and its summary. We gather and filter these pairs to curate a data set for summarizing developers' work. Along with its growing size, practicality, and challenging language domain, the data set benefits from the living field of empirical software engineering. As commits follow a typology, we propose to evaluate outputs not only by ROUGE but also by their meaning preservation.
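
To make the message/summary pairing concrete, here is a minimal sketch in Python. It is not the authors' pipeline. It assumes the common Git convention that the first line of a commit message (the subject) serves as the summary and the remaining lines form the body. The helper names `split_commit_message` and `rouge1_f1` are hypothetical, and the scorer is a simplified unigram-overlap F1 on whitespace tokens rather than a full ROUGE implementation.

```python
from collections import Counter


def split_commit_message(message: str) -> tuple[str, str]:
    """Split a commit message into (subject, body).

    Assumption: the first line is the subject (used here as the summary)
    and everything after it is the body, per the usual Git convention.
    """
    subject, _, rest = message.strip().partition("\n")
    return subject.strip(), rest.strip()


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1: unigram-overlap F1 on lowercased whitespace tokens."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: each token counts at most as often as it appears in both.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical commit message, for illustration only.
message = (
    "Fix null check in parser\n"
    "\n"
    "The parser crashed on empty input.\n"
    "Add a guard and a regression test."
)
subject, body = split_commit_message(message)
score = rouge1_f1("fix parser", subject)
```

A surface-overlap score of this kind says nothing about whether a generated summary keeps the commit's meaning (for example, whether it still describes a bug fix), which is the gap the proposed meaning-preservation evaluation targets.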
