Extracting Discriminative Keyphrases with Learned Semantic Hierarchies

2016-12-01 · COLING 2016

Yunli Wang, Yong Jin, Xiaodan Zhu, Cyril Goutte

Abstract

The goal of keyphrase extraction is to automatically identify the most salient phrases in documents. The technique has a wide range of applications, such as giving a quick glimpse of a document or extracting key content for further use. While previous work often assumes keyphrases are a static property of a given document, in many applications the appropriate set of keyphrases to extract depends on the set of documents being considered together. In particular, good keyphrases should not only accurately describe the content of a document, but also reveal what discriminates it from the other documents. In this paper, we study this problem of extracting discriminative keyphrases. Specifically, we propose to use the hierarchical semantic structure between candidate keyphrases to promote keyphrases that have the right level of specificity to clearly distinguish the target document from others. We show that such knowledge can be used to construct better discriminative keyphrase extraction systems that do not assume a static, fixed set of keyphrases for a document. We show how this helps identify the key expertise of authors from their papers, as well as the competencies covered by online courses within different domains.

Tasks

Keyphrase Extraction, Specificity
