Evaluation of a Sequence Tagging Tool for Biomedical Texts

2018-10-01 · WS 2018 · Code Available

Julien Tourille, Matthieu Doutreligne, Olivier Ferret, Aurélie Névéol, Nicolas Paris, Xavier Tannier

Abstract

Many applications in biomedical natural language processing rely on sequence tagging as an initial step to perform more complex analysis. To support text analysis in the biomedical domain, we introduce Yet Another SEquence Tagger (YASET), an open-source, multi-purpose sequence tagger that implements state-of-the-art deep learning algorithms for sequence tagging. Herein, we evaluate YASET on part-of-speech tagging and named entity recognition in a variety of text genres, including articles from the biomedical literature in English and clinical narratives in French. To further characterize performance, we report distributions over 30 runs and different sizes of training datasets. YASET provides state-of-the-art performance on the CoNLL 2003 NER dataset (F1=0.87), MEDPOST corpus (F1=0.97), MERLoT corpus (F1=0.99) and NCBI disease corpus (F1=0.81). We believe that YASET is a versatile and efficient tool that can be used for sequence tagging in biomedical and clinical texts.
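
For readers unfamiliar with how F1 is typically computed for named entity recognition, the following is a minimal sketch of exact-match, entity-level F1 over BIO tags, plus a summary of scores across repeated runs. It is not YASET's evaluation code: the BIO scheme, the function names (`extract_entities`, `entity_f1`, `summarize_runs`), and the example tags are all assumptions made for illustration, and the abstract does not specify how its F1 values or distributions were computed.

```python
from statistics import mean, stdev


def extract_entities(tags):
    """Return the set of (start, end, type) spans in a BIO tag sequence.

    Assumption: an I- tag that does not continue the current entity
    starts a new one (a lenient reading of malformed sequences).
    """
    entities, start, etype = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last span
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            entities.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, etype = i, tag[2:]
    return entities


def entity_f1(gold_tags, pred_tags):
    """Exact-match entity-level F1 for one tag sequence."""
    gold = extract_entities(gold_tags)
    pred = extract_entities(pred_tags)
    true_positives = len(gold & pred)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def summarize_runs(scores):
    """Mean and sample standard deviation of scores from repeated runs."""
    return mean(scores), stdev(scores)


# Hypothetical example data, not taken from the paper.
gold = ["B-PER", "I-PER", "O", "B-LOC"]
pred = ["B-PER", "I-PER", "O", "O"]
print(entity_f1(gold, pred))
print(summarize_runs([0.80, 0.90]))
```

In this sketch an entity counts as correct only if its boundaries and type both match the gold annotation, which is the usual convention for CoNLL-style NER evaluation. Part-of-speech tagging is normally scored per token instead.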
