Entity Resolution

Entity resolution (also known as entity matching, record linkage, or duplicate detection) is the task of finding records that refer to the same real-world entity across different data sources (e.g., data files, books, websites, and databases). (Source: Wikipedia)
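As an illustration of the definition above, the following is a minimal sketch of pairwise matching between two sources. It assumes records are plain name strings and uses character-level string similarity with a fixed threshold; the record IDs, product names, and the 0.8 threshold are hypothetical, and the systems listed on this page use far richer models.

```python
from difflib import SequenceMatcher
from itertools import product


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] after lowercasing and collapsing whitespace."""
    def normalize(s: str) -> str:
        return " ".join(s.lower().split())
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_records(source_a: dict, source_b: dict, threshold: float = 0.8) -> list:
    """Return (id_a, id_b) pairs whose names are at least `threshold` similar."""
    return [
        (id_a, id_b)
        for (id_a, name_a), (id_b, name_b) in product(source_a.items(), source_b.items())
        if similarity(name_a, name_b) >= threshold
    ]


# Hypothetical toy records from two data sources.
shop_a = {"a1": "Apple iPhone 13 128GB", "a2": "Sony WH-1000XM4 Headphones"}
shop_b = {"b1": "apple iphone 13  128gb", "b2": "Canon EOS R6 Camera"}

print(match_records(shop_a, shop_b))  # [('a1', 'b1')]
```

Comparing every record in one source with every record in the other is quadratic in the number of records, which is why several of the papers listed below focus on blocking, that is, cheaply narrowing down the candidate pairs before matching.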

Surveys on entity resolution:

Entity resolution is closely related to entity alignment, which focuses on matching entities between knowledge bases. Entity linking differs from entity resolution in that it focuses on identifying entity mentions in free text.

Papers


Showing 1–50 of 184 papers

Title	Date	Tasks	Status	Hype
Can Foundation Models Wrangle Your Data?	May 20, 2022	Entity Resolution, Imputation	Code Available	5
AutoBlock: A Hands-off Blocking Framework for Entity Matching	Dec 7, 2019	Blocking, Entity Resolution	Code Available	1
Entity Matching using Large Language Models	Oct 17, 2023	Data Integration, Entity Resolution	Code Available	1
Domain Adaptation for Deep Entity Resolution: A Design Space Exploration	Jun 1, 2022	Data Integration, Domain Adaptation	Code Available	1
Dual-Objective Fine-Tuning of BERT for Entity Matching	Jun 1, 2021	Data Integration, Entity Resolution	Code Available	1
Entity Resolution with Hierarchical Graph Attention Networks	Jun 1, 2022	Attribute, Entity Resolution	Code Available	1
Fine-tuning Large Language Models for Entity Matching	Sep 12, 2024	Data Integration, Entity Resolution	Code Available	1
PIZZA: A new benchmark for complex end-to-end task-oriented parsing	Dec 1, 2022	Entity Resolution	Code Available	1
A Critical Re-evaluation of Neural Methods for Entity Alignment	Apr 1, 2022	Entity Alignment, Entity Resolution	Code Available	1
Rotom: A Meta-Learned Data Augmentation Framework for Entity Matching, Data Cleaning, Text Classification, and Beyond	Jun 1, 2021	Data Augmentation, Entity Resolution	Code Available	1
Deep Entity Matching with Pre-Trained Language Models	Apr 1, 2020	Data Augmentation, Entity Resolution	Code Available	1
Cost-Effective In-Context Learning for Entity Resolution: A Design Space Exploration	Dec 7, 2023	Data Integration, Entity Resolution	Code Available	1
Deep Indexed Active Learning for Matching Heterogeneous Entity Representations	Apr 8, 2021	Active Learning, Blocking	Code Available	1
A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching	Sep 17, 2020	Deep Learning, Entity Resolution	Code Available	1
Pre-trained Embeddings for Entity Resolution: An Experimental Analysis [Experiment, Analysis & Benchmark]	Apr 24, 2023	Blocking, Deep Learning	Code Available	1
Estimating the Performance of Entity Resolution Algorithms: Lessons Learned Through PatentsView.org	Oct 3, 2022	Entity Resolution	Code Available	1
Intermediate Training of BERT for Product Matching	Aug 31, 2020	Entity Resolution, Language Modeling	Code Available	1
Match, Compare, or Select? An Investigation of Large Language Models for Entity Matching	May 27, 2024	Entity Resolution	Code Available	1
Supervised Contrastive Learning for Product Matching	Feb 4, 2022	Contrastive Learning, Data Augmentation	Code Available	1
WDC Products: A Multi-Dimensional Entity Matching Benchmark	Jan 23, 2023	Contrastive Learning, Data Integration	Code Available	1
Using ChatGPT for Entity Matching	May 5, 2023	Data Integration, Entity Resolution	Code Available	1
Unicorn: A Unified Multi-tasking Model for Supporting Matching Tasks in Data Integration	May 1, 2023	Data Integration, Entity Resolution	Code Available	1
A Practioner's Guide to Evaluating Entity Resolution Results	Sep 14, 2015	Clustering, Entity Resolution	Code Available	1
How to Evaluate Entity Resolution Systems: An Entity-Centric Framework with Application to Inventor Name Disambiguation	Apr 8, 2024	Entity Resolution	Code Available	1
Anaphora and Coreference Resolution: A Review	May 30, 2018	coreference-resolution, Coreference Resolution	Unverified	0
Automated Metadata Harmonization Using Entity Resolution & Contextual Embedding	Oct 17, 2020	Entity Resolution	Unverified	0
Automated Construction of a Knowledge Graph of Nuclear Fusion Energy for Effective Elicitation and Retrieval of Information	Apr 10, 2025	Entity Resolution, named-entity-recognition	Unverified	0
Adaptive Candidate Generation for Scalable Edge-discovery Tasks on Data Graphs	May 2, 2016	Blocking, Entity Resolution	Unverified	0
Automatic Curation and Visualization of Crime Related Information from Incrementally Crawled Multi-source News Reports	Aug 1, 2018	Articles, Entity Resolution	Unverified	0
A Weak Self-supervision with Transition-Based Modeling for Reference Resolution	Nov 16, 2021	Entity Resolution	Unverified	0
(Almost) All of Entity Resolution	Aug 10, 2020	All, Clustering	Unverified	0
Alleviating Poor Context with Background Knowledge for Named Entity Disambiguation	Aug 1, 2016	Entity Disambiguation, Entity Linking	Unverified	0
Author Name Disambiguation in Bibliographic Databases: A Survey	Apr 14, 2020	Clustering, Entity Resolution	Unverified	0
Complex and Holographic Embeddings of Knowledge Graphs: A Comparison	Jul 5, 2017	Articles, Entity Resolution	Unverified	0
A Three-Way Model for Collective Learning on Multi-Relational Data	Jan 1, 2011	Entity Resolution, Relational Reasoning	Unverified	0
Em-K Indexing for Approximate Query Matching in Large-scale ER	Nov 7, 2021	Blocking, Entity Resolution	Unverified	0
End-to-End Entity Resolution and Question Answering Using Differentiable Knowledge Graphs	Sep 13, 2021	Entity Resolution, Knowledge Graphs	Unverified	0
A Theoretical Analysis of First Heuristics of Crowdsourced Entity Resolution	Feb 3, 2017	Entity Resolution	Unverified	0
A Survey on Efficient Processing of Similarity Queries over Neural Embeddings	Apr 17, 2022	Entity Resolution, Information Retrieval	Unverified	0
Aleda, a free large-scale entity database for French	May 1, 2012	Entity Linking, Entity Resolution	Unverified	0
Clustering with Fast, Automated and Reproducible assessment applied to longitudinal neural tracking	Mar 19, 2020	Clustering, Entity Resolution	Unverified	0
Clustering Via Crowdsourcing	Apr 7, 2016	Clustering, Entity Resolution	Unverified	0
Clustering with Noisy Queries	Jun 22, 2017	Clustering, Entity Resolution	Unverified	0
Collective Entity Resolution with Multi-Focal Attention	Aug 1, 2016	Entity Resolution, Topic Models	Unverified	0
Combining Data-driven Supervision with Human-in-the-loop Feedback for Entity Resolution	Nov 20, 2021	Entity Resolution	Unverified	0
Combining Global and Local Merges in Logic-based Entity Resolution	May 26, 2023	Entity Resolution	Unverified	0
A Study on Entity Resolution for Email Conversations	May 1, 2020	Entity Resolution	Unverified	0
Concept Identification of Directly and Indirectly Related Mentions Referring to Groups of Persons	Jul 2, 2021	Articles, Clustering	Unverified	0
CorDEL: A Contrastive Deep Learning Approach for Entity Linkage	Sep 15, 2020	Deep Learning, Entity Resolution	Unverified	0
Clustering on the Edge: Learning Structure in Graphs	May 5, 2016	Clustering, Entity Resolution	Unverified	0


Datasets: Amazon-Google, Abt-Buy, WDC Products-80%cc-seen-medium, WDC Computers-small, WDC Computers-xlarge, WDC Products-50%cc-unseen-medium, WDC Watches-small, MusicBrainz20K, WDC Watches-xlarge, WDC Products-80%cc-seen-medium-multi, WDC Products

Benchmark Results

Amazon-Google

#	Model	Metric	Claimed	Verified	Status
1	gpt4-0613_fewshot-10	F1 (%)	85.21	—	Unverified
2	gpt-4o-mini-2024-07-18_fine_tuned	F1 (%)	80.25	—	Unverified
3	RoBERTa-SupCon	F1 (%)	79.28	—	Unverified
4	RobEM	F1 (%)	79.06	—	Unverified
5	Random Forest	F1 (%)	79	—	Unverified
6	HG	F1 (%)	76.4	—	Unverified
7	Ditto	F1 (%)	75.58	—	Unverified
8	CorDEL-Sum	F1 (%)	70.2	—	Unverified
9	DeepMatcher - Hybrid	F1 (%)	69.3	—	Unverified
10	D-HAT	F1 (%)	67.5	—	Unverified

Abt-Buy

#	Model	Metric	Claimed	Verified	Status
1	gpt4-0613_zeroshot	F1 (%)	95.78	—	Unverified
2	RoBERTa-SupCon	F1 (%)	94.29	—	Unverified
3	gpt-4o-mini-2024-07-18_fine_tuned	F1 (%)	94.09	—	Unverified
4	gpt-4o-2024-08-06	F1 (%)	92.2	—	Unverified
5	RobEM	F1 (%)	90.9	—	Unverified
6	HG	F1 (%)	89.8	—	Unverified
7	Ditto	F1 (%)	89.33	—	Unverified
8	gpt-4o-mini-2024-07-18	F1 (%)	87.68	—	Unverified
9	Meta-Llama-3.1-8B-Instruct_fine_tuned	F1 (%)	87.34	—	Unverified
10	Random Forest	F1 (%)	85	—	Unverified

WDC Products-80%cc-seen-medium

#	Model	Metric	Claimed	Verified	Status
1	gpt4-0613_zeroshot	F1 (%)	89.61	—	Unverified
2	gpt-4o-2024-08-06_fine_tuned_wdc_small	F1 (%)	87.1	—	Unverified
3	gpt-4o-mini-2024-07-18_structured_explanations	F1 (%)	84.38	—	Unverified
4	gpt-4o-mini-2024-07-18	F1 (%)	81.61	—	Unverified
5	RoBERTa-SupCon	F1 (%)	79.99	—	Unverified
6	Llama3.1_70B_structured_explanations	F1 (%)	76.7	—	Unverified
7	Llama3.1_70B	F1 (%)	75.2	—	Unverified
8	Llama3.1_8B_error-based_example_selection	F1 (%)	74.37	—	Unverified
9	Llama3.1_8B_structured_explanations	F1 (%)	74.13	—	Unverified
10	Ditto	F1 (%)	73.93	—	Unverified

WDC Computers-small

#	Model	Metric	Claimed	Verified	Status
1	BERT	F1 (%)	96.53	—	Unverified
2	RoBERTa-SupCon	F1 (%)	95.21	—	Unverified
3	HG	F1 (%)	88.5	—	Unverified
4	DADER-MMD	F1 (%)	88	—	Unverified
5	Ditto	F1 (%)	80.76	—	Unverified
6	JointBERT	F1 (%)	77.55	—	Unverified

WDC Computers-xlarge

#	Model	Metric	Claimed	Verified	Status
1	RoBERTa-SupCon	F1 (%)	98.33	—	Unverified
2	JointBERT	F1 (%)	97.49	—	Unverified
3	BERT	F1 (%)	97.37	—	Unverified
4	HG	F1 (%)	96.5	—	Unverified
5	Ditto	F1 (%)	95.45	—	Unverified
6	Random Forest	F1 (%)	78	—	Unverified

WDC Products-50%cc-unseen-medium

#	Model	Metric	Claimed	Verified	Status
1	RoBERTa-base	F1 (%)	71.14	—	Unverified
2	Ditto	F1 (%)	70.66	—	Unverified
3	HG	F1 (%)	68.74	—	Unverified
4	RoBERTa-SupCon	F1 (%)	57.23	—	Unverified

WDC Watches-small

#	Model	Metric	Claimed	Verified	Status
1	HG	F1 (%)	94	—	Unverified
2	DADER-NoDA	F1 (%)	88.6	—	Unverified
3	Ditto	F1 (%)	85.12	—	Unverified
4	JointBERT	F1 (%)	75.83	—	Unverified

MusicBrainz20K

#	Model	Metric	Claimed	Verified	Status
1	ALMSER-GB	F1	0.95	—	Unverified
2	FAMER-SplitMerge	F1	0.88	—	Unverified
3	FAMER-Split	F1	0.84	—	Unverified

WDC Watches-xlarge

#	Model	Metric	Claimed	Verified	Status
1	JointBERT	F1 (%)	97.09	—	Unverified
2	Ditto	F1 (%)	96.53	—	Unverified
3	HG	F1 (%)	96.5	—	Unverified

WDC Products-80%cc-seen-medium-multi

#	Model	Metric	Claimed	Verified	Status
1	RoBERTa-SupCon	F1 Micro	88.63	—	Unverified
2	RoBERTa-base	F1 Micro	52.03	—	Unverified

WDC Products

#	Model	Metric	Claimed	Verified	Status
1	gpt-4o-2024-08-06_fine_tuned_wdc_small	F1 (%)	87.07	—	Unverified
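
The benchmark tables report F1 without spelling out how it is computed. The following is a minimal sketch, assuming F1 is the harmonic mean of precision and recall over unordered pairs of records predicted to match; the function name `pairwise_f1` and the record IDs are hypothetical, and individual benchmarks may use a different protocol (for example, the micro-averaged F1 reported for the multi-class variant).

```python
def pairwise_f1(predicted_pairs, true_pairs) -> float:
    """F1 over unordered record pairs: harmonic mean of precision and recall."""
    # frozenset makes ("a1", "b1") and ("b1", "a1") count as the same pair.
    predicted = {frozenset(pair) for pair in predicted_pairs}
    actual = {frozenset(pair) for pair in true_pairs}

    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical predictions and ground truth.
predicted = [("a1", "b1"), ("a2", "b2"), ("a3", "b4")]
actual = [("b1", "a1"), ("a2", "b2"), ("a3", "b3"), ("a5", "b5")]

f1 = pairwise_f1(predicted, actual)
print(f"F1 = {f1:.4f}, F1 (%) = {100 * f1:.2f}")  # F1 = 0.5714, F1 (%) = 57.14
```

Here two of the three predicted pairs are correct (precision 2/3) and two of the four true pairs are found (recall 1/2), giving F1 = 4/7. Most tables above report F1 as a percentage, while the ALMSER and FAMER rows report it on a 0 to 1 scale.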