SOTAVerified

LABDA at SemEval-2017 Task 10: Extracting Keyphrases from Scientific Publications by combining the BANNER tool and the UMLS Semantic Network

2017-08-01SEMEVAL 2017Unverified0· sign in to hype

Isabel Segura-Bedmar, Crist{\'o}bal Col{\'o}n-Ruiz, Paloma Mart{\'\i}nez

Unverified — Be the first to reproduce this paper.

Reproduce

Abstract

This paper describes the system presented by the LABDA group at SemEval 2017 Task 10 ScienceIE, specifically for the subtasks of identification and classification of keyphrases from scientific articles. For the task of identification, we use the BANNER tool, a named entity recognition system, which is based on conditional random fields (CRF) and has obtained successful results in the biomedical domain. To classify keyphrases, we study the UMLS semantic network and propose a possible linking between the keyphrase types and the UMLS semantic groups. Based on this semantic linking, we create a dictionary for each keyphrase type. Then, a feature indicating if a token is found in one of these dictionaries is incorporated to feature set used by the BANNER tool. The final results on the test dataset show that our system still needs to be improved, but the conditional random fields and, consequently, the BANNER system can be used as a first approximation to identify and classify keyphrases.

Tasks

Reproductions