Contributions to Clinical Named Entity Recognition in Portuguese

2019-08-01 · WS 2019

Fábio Lopes, César Teixeira, Hugo Gonçalo Oliveira

Code

github.com/fabioacl/PortugueseClinicalNER
Official · In paper · ★ 16

Abstract

Because different languages can pose different challenges, this paper presents the following contributions to Information Extraction from clinical text in Portuguese: a collection of 281 clinical texts with manually annotated named entities; word embeddings trained on a larger collection of similar texts; and results of using BiLSTM-CRF neural networks for named entity recognition on the annotated collection, including a comparison of in-domain and out-of-domain word embeddings for this task. Although the in-domain embeddings were learned from much less data, they yield higher performance. When tested on 20 independent clinical texts, the model using them achieved better results than a model using larger out-of-domain embeddings.

Tasks

Named Entity Recognition (NER), Word Embeddings
