Fine-Grained Named Entities for Corona News

2024-04-20

Sefika Efeoglu, Adrian Paschke

Abstract

Information resources such as newspapers have produced unstructured text data in various languages related to the corona outbreak since December 2019. Analyzing these texts is time-consuming unless they are first represented in a structured format. An information extraction pipeline built on two essential tasks, named entity tagging and relation extraction, can be applied to these texts to accomplish this goal. This study proposes a data annotation pipeline to generate training data from corona news articles, including generic and domain-specific entities. Named entity recognition models are trained on this annotated corpus and then evaluated on test sentences manually annotated by domain experts. The code base and demonstration are available at https://github.com/sefeoglu/coronanews-ner.git.
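
The evaluation step described in the abstract, comparing model predictions with expert-annotated test sentences, can be illustrated with a minimal sketch. It assumes BIO-tagged token sequences and exact-match, entity-level scoring; the abstract does not state the tagging scheme or the metric, so both are assumptions. The labels `DISEASE` and `LOC` and the function names `bio_to_spans` and `entity_prf` are hypothetical and are not taken from the linked repository.

```python
from typing import List, Set, Tuple

# (start token index, end token index exclusive, entity type)
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO tag sequence."""
    spans: Set[Span] = set()
    start, label = None, None
    # A trailing "O" sentinel closes an entity that ends the sentence.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if label is not None and not continues:
            spans.add((start, i, label))
            start, label = None, None
        # "B-" always opens a span; a stray "I-" is treated as an opening too.
        if tag.startswith("B-") or (tag.startswith("I-") and label is None):
            start, label = i, tag[2:]
    return spans


def entity_prf(gold: List[List[str]], pred: List[List[str]]) -> Tuple[float, float, float]:
    """Exact-match entity-level precision, recall, and F1 over all sentences."""
    tp = n_gold = n_pred = 0
    for gold_tags, pred_tags in zip(gold, pred):
        g, p = bio_to_spans(gold_tags), bio_to_spans(pred_tags)
        tp += len(g & p)  # a hit requires identical boundaries and type
        n_gold += len(g)
        n_pred += len(p)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: the model finds the disease mention but truncates the location.
gold = [["B-DISEASE", "O", "B-LOC", "I-LOC"]]
pred = [["B-DISEASE", "O", "B-LOC", "O"]]
print(entity_prf(gold, pred))
```

Exact matching is the stricter choice: a prediction with the right type but the wrong boundary counts as both a false positive and a false negative, which is why the truncated location above lowers precision and recall together.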
