Generating Scientific Definitions with Controllable Complexity
Tal August, Katharina Reinecke, Noah Smith
- Code: github.com/talaugust/definition-complexity (official implementation, linked in the paper; PyTorch)
Abstract
Unfamiliar terminology and complex language can present barriers to understanding science. Natural language processing stands to help address these issues by automatically defining unfamiliar terms. We introduce a new task and dataset for defining scientific terms and controlling the complexity of generated definitions as a way of adapting to a specific reader’s background knowledge. We test four definition generation methods for this new task, finding that a sequence-to-sequence approach is most successful. We then explore the version of the task in which definitions are generated at a target complexity level. We introduce a novel reranking approach and find in human evaluations that it offers superior fluency while also controlling complexity, compared to several controllable generation baselines.