Named Entity Recognition (NER)

Named Entity Recognition (NER) is a Natural Language Processing (NLP) task that identifies named entities in text and classifies them into predefined categories such as person names, organizations, and locations. The goal of NER is to extract structured information from unstructured text and represent it in a machine-readable format. Approaches typically use BIO notation, which marks the beginning (B) and the inside (I) of each entity; O marks tokens that are not part of any entity.

Example:

| Mark | Watney | visited | Mars |
| --- | --- | --- | --- |
| B-PER | I-PER | O | B-LOC |

(Image credit: Zalando)
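
To make the BIO scheme concrete, here is a minimal Python sketch that groups tagged tokens back into entities. The function name `bio_to_spans` and the handling of a stray `I-` tag (treated like `O`) are assumptions made for illustration, not part of any particular NER library.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    def flush() -> None:
        # Close the entity that is currently open, if there is one.
        nonlocal current_tokens, current_type
        if current_tokens and current_type is not None:
            spans.append((" ".join(current_tokens), current_type))
        current_tokens, current_type = [], None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new entity.
            flush()
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # An I- tag continues the open entity of the same type.
            current_tokens.append(token)
        else:
            # "O", or an I- tag with no matching open entity
            # (assumption: such stray tags are ignored).
            flush()
    flush()
    return spans


tokens = ["Mark", "Watney", "visited", "Mars"]
tags = ["B-PER", "I-PER", "O", "B-LOC"]
print(bio_to_spans(tokens, tags))
```

Applied to the example sentence, the sketch yields one PER entity spanning "Mark Watney" and one LOC entity for "Mars". Other conventions for stray `I-` tags exist, such as promoting them to the start of a new entity.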

Papers

Showing 251–275 of 2874 papers

Title	Date	Tasks	Status	Hype	Score
DeepStruct: Pretraining of Language Models for Structure Prediction	May 21, 2022	coreference-resolution, Coreference Resolution	Code Available	1	5
Deep Span Representations for Named Entity Recognition	Oct 9, 2022	named-entity-recognition, Named Entity Recognition	Code Available	1	5
A general-purpose material property data extraction pipeline from large polymer corpora using Natural Language Processing	Sep 27, 2022	Articles, Language Modeling	Code Available	1	5
DeID-GPT: Zero-shot Medical Text De-Identification by GPT-4	Mar 20, 2023	Benchmarking, De-identification	Code Available	1	5
ELIT: Emory Language and Information Toolkit	Sep 8, 2021	AMR Parsing, Constituency Parsing	Code Available	1	5
Domain-Specific NER via Retrieving Correlated Samples	Aug 27, 2022	Named Entity Recognition, Named Entity Recognition (NER)	Code Available	1	5
Supplementary Features of BiLSTM for Enhanced Sequence Labeling	May 31, 2023	Aspect-Based Sentiment Analysis, Chinese Named Entity Recognition	Code Available	1	5
A Simple but Effective Approach to Improve Structured Language Model Output for Information Extraction	Feb 20, 2024	Language Modeling, Language Modelling	Code Available	1	5
Do Syntax Trees Help Pre-trained Transformers Extract Information?	Aug 20, 2020	Graph Neural Network, named-entity-recognition	Code Available	1	5
Empirical Analysis of Unlabeled Entity Problem in Named Entity Recognition	Dec 10, 2020	named-entity-recognition, Named Entity Recognition	Code Available	1	5
Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training	Sep 10, 2021	Language Modeling, Language Modelling	Code Available	1	5
AraELECTRA: Pre-Training Text Discriminators for Arabic Language Understanding	Dec 31, 2020	Language Modeling, Language Modelling	Code Available	1	5
Distantly Supervised Named Entity Recognition via Confidence-Based Multi-Class Positive and Unlabeled Learning	Mar 3, 2022	named-entity-recognition, Named Entity Recognition	Code Available	1	5
A Sequence-to-Set Network for Nested Named Entity Recognition	May 19, 2021	Decoder, named-entity-recognition	Code Available	1	5
Annotating the Tweebank Corpus on Named Entity Recognition and Building NLP Models for Social Media Analysis	Jan 18, 2022	Dependency Parsing, named-entity-recognition	Code Available	1	5
Domain specific BERT representation for Named Entity Recognition of lab protocol	Dec 21, 2020	named-entity-recognition, Named Entity Recognition	Code Available	1	5
DWIE: an entity-centric dataset for multi-task document-level information extraction	Sep 26, 2020	coreference-resolution, Coreference Resolution	Code Available	1	5
Earnings-21: A Practical Benchmark for ASR in the Wild	Apr 22, 2021	named-entity-recognition, Named Entity Recognition	Code Available	1	5
Efficient Test Time Adapter Ensembling for Low-resource Language Varieties	Sep 10, 2021	Cross-Lingual Transfer, named-entity-recognition	Code Available	1	5
A Span-Based Model for Joint Overlapped and Discontinuous Named Entity Recognition	Jun 28, 2021	named-entity-recognition, Named Entity Recognition	Code Available	1	5
AraBERT: Transformer-based Model for Arabic Language Understanding	Feb 28, 2020	model, named-entity-recognition	Code Available	1	5
ATCO2 corpus: A Large-Scale Dataset for Research on Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications	Nov 8, 2022	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Code Available	1	5
ACLM: A Selective-Denoising based Generative Data Augmentation Approach for Low-Resource Complex NER	Jun 1, 2023	Data Augmentation, Denoising	Code Available	1	5
End-to-End Chinese Speaker Identification	Jul 1, 2022	coreference-resolution, Coreference Resolution	Code Available	1	5
DiscoverPath: A Knowledge Refinement and Retrieval System for Interdisciplinarity on Biomedical Research	Sep 4, 2023	Articles, named-entity-recognition	Code Available	1	5

Datasets: CoNLL 2003 (English), Ontonotes v5 (English), NCBI Disease, WNUT 2017, ACE 2005, JNLPBA, BC5CDR, GENIA, BC2GM, BC5CDR-chemical, SLUE, CoNLL++

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	ACE + document-context	F1	94.6	—	Unverified
2	LUKE 483M	F1	94.3	—	Unverified
3	Co-regularized LUKE	F1	94.22	—	Unverified
4	LUKE + SubRegWeigh (K-means)	F1	94.2	—	Unverified
5	ASP+T5-3B	F1	94.1	—	Unverified
6	FLERT XLM-R	F1	94.09	—	Unverified
7	PL-Marker	F1	94	—	Unverified
8	CL-KL	F1	93.85	—	Unverified
9	XLNet-GCN	F1	93.82	—	Unverified
10	RoBERTa + SubRegWeigh (K-means)	F1	93.81	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	BERT-MRC+DSC	F1	92.07	—	Unverified
2	PL-Marker	F1	91.9	—	Unverified
3	Baseline + BS	F1	91.74	—	Unverified
4	Biaffine-NER	F1	91.3	—	Unverified
5	BERT-MRC	F1	91.11	—	Unverified
6	PIQN	F1	90.96	—	Unverified
7	HGN	F1	90.92	—	Unverified
8	Syn-LSTM + BERT (wo doc-context)	F1	90.85	—	Unverified
9	DiffusionNER	F1	90.66	—	Unverified
10	W2NER	F1	90.5	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	BioBERT	F1	89.71	—	Unverified
2	SpanModel + SequenceLabelingModel	F1	89.6	—	Unverified
3	SciFive-Base	F1	89.39	—	Unverified
4	BLSTM-CNN-Char (SparkNLP)	F1	89.13	—	Unverified
5	Spark NLP	F1	89.13	—	Unverified
6	KeBioLM	F1	89.1	—	Unverified
7	CL-KL	F1	88.96	—	Unverified
8	BioKMNER + BioBERT	F1	88.77	—	Unverified
9	BioLinkBERT (large)	F1	88.76	—	Unverified
10	CompactBioBERT	F1	88.67	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	CL-KL	F1	60.45	—	Unverified
2	RoBERTa + SubRegWeigh (K-means)	F1	60.29	—	Unverified
3	BERT-CRF (Replicated in AdaSeq)	F1	59.69	—	Unverified
4	RoBERTa-BiLSTM-context	F1	59.61	—	Unverified
5	BERT + RegLER	F1	58.9	—	Unverified
6	TNER -xlm-r-large	F1	58.5	—	Unverified
7	HGN	F1	57.41	—	Unverified
8	ASA + RoBERTa	F1	57.3	—	Unverified
9	BERTweet	F1	56.5	—	Unverified
10	MINER	F1	54.86	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	Ours: cross-sentence ALB	F1	90.9	—	Unverified
2	GoLLIE	F1	89.6	—	Unverified
3	PromptNER [RoBERTa-large]	F1	88.26	—	Unverified
4	PIQN	F1	87.42	—	Unverified
5	PromptNER [BERT-large]	F1	87.21	—	Unverified
6	DiffusionNER	F1	86.93	—	Unverified
7	BERT-MRC	F1	86.88	—	Unverified
8	UniNER-7B	F1	86.69	—	Unverified
9	Locate and Label	F1	86.67	—	Unverified
10	BoningKnife	F1	85.46	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	KeBioLM	F1	82	—	Unverified
2	BLSTM-CNN-Char (SparkNLP)	F1	81.29	—	Unverified
3	Spark NLP	F1	81.29	—	Unverified
4	BINDER	F1	80.3	—	Unverified
5	BioMobileBERT	F1	80.13	—	Unverified
6	BioLinkBERT (large)	F1	80.06	—	Unverified
7	DistilBioBERT	F1	79.97	—	Unverified
8	CompactBioBERT	F1	79.88	—	Unverified
9	BioDistilBERT	F1	79.1	—	Unverified
10	PubMedBERT uncased	F1	79.1	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	BINDER	F1	91.9	—	Unverified
2	ConNER	F1	91.3	—	Unverified
3	CL-L2	F1	90.99	—	Unverified
4	aimped	F1	90.95	—	Unverified
5	BertForTokenClassification (Spark NLP)	F1	90.89	—	Unverified
6	BioLinkBERT (large)	F1	90.22	—	Unverified
7	ELECTRAMed	F1	90.03	—	Unverified
8	Spark NLP	F1	89.73	—	Unverified
9	BLSTM-CNN-Char (SparkNLP)	F1	89.73	—	Unverified
10	UniNER-7B	F1	89.34	—	Unverified