
Pooled Contextualized Embeddings for Named Entity Recognition

2019-06-01 · NAACL 2019

Alan Akbik, Tanja Bergmann, Roland Vollgraf


Abstract

Contextual string embeddings are a recent type of contextualized word embedding that has been shown to yield state-of-the-art results when used in a range of sequence labeling tasks. They are based on character-level language models, which treat text as distributions over characters and can generate embeddings for any string of characters within any textual context. However, such purely character-based approaches struggle to produce meaningful embeddings if a rare string is used in an underspecified context. To address this drawback, we propose a method in which we dynamically aggregate contextualized embeddings of each unique string that we encounter. We then use a pooling operation to distill a "global" word representation from all contextualized instances. We evaluate these "pooled contextualized embeddings" on common named entity recognition (NER) tasks such as CoNLL-03 and WNUT and show that our approach significantly improves the state of the art for NER. We make all code and pre-trained models available to the research community for use and reproduction.
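
The aggregation-and-pooling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' released code: the class name `PooledEmbeddingMemory`, the choice of mean, min, and max as pooling operations, and the concatenation of the pooled vector with the current contextual embedding are all assumptions made for illustration. The toy vectors stand in for the output of a character-level language model.

```python
from typing import Dict, List, Sequence


class PooledEmbeddingMemory:
    """Hypothetical helper: stores every contextual embedding seen per unique string."""

    def __init__(self, pooling: str = "mean") -> None:
        if pooling not in ("mean", "min", "max"):
            raise ValueError(f"unsupported pooling operation: {pooling}")
        self.pooling = pooling
        # Maps each unique string to the list of contextual embeddings seen so far.
        self.memory: Dict[str, List[List[float]]] = {}

    def add(self, word: str, embedding: Sequence[float]) -> None:
        """Record one contextualized instance of `word`."""
        self.memory.setdefault(word, []).append(list(embedding))

    def pooled(self, word: str) -> List[float]:
        """Pool all stored instances of `word` into one 'global' vector."""
        instances = self.memory.get(word)
        if not instances:
            raise KeyError(f"no embeddings stored for {word!r}")
        columns = list(zip(*instances))  # one tuple per embedding dimension
        if self.pooling == "mean":
            return [sum(col) / len(col) for col in columns]
        if self.pooling == "min":
            return [min(col) for col in columns]
        return [max(col) for col in columns]

    def embed(self, word: str, contextual: Sequence[float]) -> List[float]:
        """Add the new instance, then return [contextual ; pooled] (assumed layout)."""
        self.add(word, contextual)
        return list(contextual) + self.pooled(word)


# Toy usage: the same rare string appears in two different contexts.
memory = PooledEmbeddingMemory(pooling="mean")
first = memory.embed("Indra", [1.0, 3.0])
second = memory.embed("Indra", [3.0, 5.0])
```

In this sketch the second occurrence of the string receives a representation that combines its own context with the average over both occurrences, which is how an underspecified context could borrow information from earlier, more informative ones.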
