Simple yet Powerful: An Overlooked Architecture for Nested Named Entity Recognition

2021-11-16 · ACL ARR November 2021

Anonymous

Abstract

Named Entity Recognition (NER) is an important task in Natural Language Processing that aims to identify text spans belonging to predefined categories. Traditional NER research ignores nested entities, which are entities contained within other entity mentions. Although several methods have been proposed to address this case, most of them rely on complex task-specific structures and ignore potentially useful baselines for the task. We argue that this creates an overly optimistic impression of their performance. This paper revisits the Multiple LSTM-CRF (MLC) model, a simple, overlooked, yet powerful approach based on training an independent sequence labeling model for each entity type. Extensive experiments with three nested NER corpora show that, despite its simplicity, this model performs better than, or at least as well as, more sophisticated methods. Furthermore, we show that the MLC architecture achieves state-of-the-art results on the Chilean Waiting List corpus when pre-trained language models are included. In addition, we propose new task-specific metrics that adequately measure the ability of models to detect nestings. The results show that standard NER metrics do not adequately measure a model's ability to detect nested entities, while our task-specific metrics provide new evidence on how existing approaches handle the task.
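
The idea of training one independent sequence labeler per entity type can be illustrated with a minimal Python sketch of the data handling around such models. It is not the authors' implementation: the function names, the BIO tagging scheme, and the example labels (`Finding`, `Body_Part`) are assumptions made for illustration, and the taggers themselves (LSTM-CRF in the paper) are left out. Nested spans are split into one flat tag sequence per type, and the per-type outputs are merged back into a single set of possibly nested spans.

```python
from typing import Dict, List, Tuple

# (start, end_exclusive, entity_type); hypothetical span format for this sketch
Span = Tuple[int, int, str]


def spans_to_per_type_bio(n_tokens: int, spans: List[Span]) -> Dict[str, List[str]]:
    """Build one flat BIO sequence per entity type.

    Each sequence would be the training target of its own tagger.
    Limitation: nestings of the SAME type cannot be represented in a
    single flat sequence; here a later span simply overwrites earlier tags.
    """
    tags: Dict[str, List[str]] = {}
    for start, end, etype in spans:
        seq = tags.setdefault(etype, ["O"] * n_tokens)
        seq[start] = "B"
        for i in range(start + 1, end):
            seq[i] = "I"
    return tags


def bio_to_spans(seq: List[str], etype: str) -> List[Span]:
    """Decode one BIO sequence back into spans of the given type."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(seq + ["O"]):  # trailing "O" closes an open span
        if tag != "I" and start is not None:
            spans.append((start, i, etype))
            start = None
        if tag == "B":
            start = i
    return spans


def merge_predictions(per_type: Dict[str, List[str]]) -> List[Span]:
    """Union of the spans predicted by every per-type model."""
    merged: List[Span] = []
    for etype, seq in per_type.items():
        merged.extend(bio_to_spans(seq, etype))
    return sorted(merged)


tokens = ["pain", "in", "left", "knee"]
gold = [(0, 4, "Finding"), (2, 4, "Body_Part")]  # Body_Part nested in Finding

per_type = spans_to_per_type_bio(len(tokens), gold)
recovered = merge_predictions(per_type)
```

Because each type has its own sequence, the overlapping `Finding` and `Body_Part` spans never compete for the same tag position, which is what lets flat sequence labelers cover nestings across different types.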

Tasks

Named Entity Recognition (NER), Nested Named Entity Recognition
