Few-NERD: A Few-Shot Named Entity Recognition Dataset

2021-05-16 · ACL 2021

Ning Ding, Guangwei Xu, Yulin Chen, Xiaobin Wang, Xu Han, Pengjun Xie, Hai-Tao Zheng, Zhiyuan Liu

Code

github.com/thunlp/Few-NERD
Official · In paper · pytorch · ★ 398
github.com/psunlpgroup/container
pytorch · ★ 118
github.com/katzurik/neretrieve
none · ★ 31
github.com/wangpeiyi9979/esd
pytorch · ★ 28
github.com/renll/sparselt
pytorch · ★ 14
github.com/zifengcheng/cdap
pytorch · ★ 2
github.com/microsoft/vert-papers/tree/master/papers/DecomposedMetaNER
pytorch · ★ 0

Abstract

Recently, considerable literature has grown up around the theme of few-shot named entity recognition (NER), but little published benchmark data focuses specifically on this practical and challenging task. Current approaches collect existing supervised NER datasets and re-organize them into the few-shot setting for empirical study. These strategies conventionally aim to recognize coarse-grained entity types with few examples, while in practice, most unseen entity types are fine-grained. In this paper, we present Few-NERD, a large-scale human-annotated few-shot NER dataset with a hierarchy of 8 coarse-grained and 66 fine-grained entity types. Few-NERD consists of 188,238 sentences from Wikipedia, comprising 4,601,160 words, each annotated as context or as part of a two-level entity type. To the best of our knowledge, this is the first few-shot NER dataset and the largest human-crafted NER dataset. We construct benchmark tasks with different emphases to comprehensively assess the generalization capability of models. Extensive empirical results and analysis show that Few-NERD is challenging and that the problem requires further research. We make Few-NERD public at https://ningding97.github.io/fewnerd/.
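
The two-level annotation described in the abstract can be illustrated with a small sketch. It assumes, purely for illustration, that each word carries a single string tag of the form "coarse-fine" and that context words carry the tag "O"; the separator, the context tag, the example tags, and the helper name parse_label are all assumptions made here, not details confirmed by this page.

```python
from typing import NamedTuple, Optional


class EntityLabel(NamedTuple):
    """A two-level entity type: one coarse-grained and one fine-grained part."""
    coarse: str
    fine: str


def parse_label(tag: str, sep: str = "-", context_tag: str = "O") -> Optional[EntityLabel]:
    """Split a two-level tag into (coarse, fine); return None for context words.

    The "coarse-fine" layout and the "O" context tag are assumptions of this sketch.
    """
    if tag == context_tag:
        return None
    # Split on the first separator only, so a fine type may itself contain one.
    coarse, _, fine = tag.partition(sep)
    if not coarse or not fine:
        raise ValueError(f"expected a tag like 'coarse{sep}fine', got {tag!r}")
    return EntityLabel(coarse, fine)


# Illustrative sentence and tags (hypothetical, not copied from the dataset).
tokens = ["Ada", "Lovelace", "visited", "London"]
tags = ["person-scholar", "person-scholar", "O", "location-GPE"]
parsed = [parse_label(t) for t in tags]

for token, label in zip(tokens, parsed):
    print(token, label)
```

Keeping the coarse and fine parts separate makes it straightforward to evaluate a model at either level of the hierarchy.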

Tasks

Few-shot NER, Named Entity Recognition, Named Entity Recognition (NER)

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
Few-NERD (SUP)	BERT-Tagger	F1-Measure	67.13	—	Unverified
