Entity Resolution

Entity resolution (also known as entity matching, record linkage, or duplicate detection) is the task of finding records that refer to the same real-world entity across different data sources (e.g., data files, books, websites, and databases). (Source: Wikipedia)
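
To make the definition concrete, the following is a minimal sketch of entity resolution as pairwise matching between two sources. It is only an illustration under stated assumptions: the records, the `normalize` and `resolve` helpers, and the 0.7 threshold are all hypothetical, and the character-level similarity from Python's standard `difflib` module stands in for the learned matchers used by the papers listed on this page.

```python
from difflib import SequenceMatcher
from itertools import product

# Hypothetical records from two sources (illustrative data, not from any benchmark).
source_a = [
    {"id": "a1", "name": "Apple iPhone 13 128GB Blue"},
    {"id": "a2", "name": "Samsung Galaxy S21 Ultra"},
]
source_b = [
    {"id": "b1", "name": "iPhone 13 (128 GB, blue) by Apple"},
    {"id": "b2", "name": "Sony WH-1000XM4 Headphones"},
]


def normalize(text):
    """Lowercase, replace punctuation with spaces, and sort tokens
    so that word order does not affect the comparison."""
    cleaned = "".join(c if c.isalnum() else " " for c in text.lower())
    return " ".join(sorted(cleaned.split()))


def similarity(a, b):
    """Character-level similarity in [0, 1] between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def resolve(left, right, threshold=0.7):
    """Compare every cross-source pair and keep those above the threshold.
    The threshold is an assumed value chosen for this toy data."""
    matches = []
    for l, r in product(left, right):
        if similarity(l["name"], r["name"]) >= threshold:
            matches.append((l["id"], r["id"]))
    return matches


print(resolve(source_a, source_b))  # [('a1', 'b1')]
```

Comparing every pair is quadratic in the number of records, which is why several papers below study blocking, a step that cheaply narrows the candidate pairs before matching.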

Surveys on entity resolution:

Entity resolution is closely related to entity alignment, which focuses on matching entities between knowledge bases. Entity linking differs from entity resolution in that it focuses on identifying entity mentions in free text.

Papers

Showing 1–50 of 184 papers

Title	Date	Tasks	Status	Hype	Score
Can Foundation Models Wrangle Your Data?	May 20, 2022	Entity Resolution, Imputation	Code Available	5	5
AutoBlock: A Hands-off Blocking Framework for Entity Matching	Dec 7, 2019	Blocking, Entity Resolution	Code Available	1	5
How to Evaluate Entity Resolution Systems: An Entity-Centric Framework with Application to Inventor Name Disambiguation	Apr 8, 2024	Entity Resolution	Code Available	1	5
Intermediate Training of BERT for Product Matching	Aug 31, 2020	Entity Resolution, Language Modeling	Code Available	1	5
Rotom: A Meta-Learned Data Augmentation Framework for Entity Matching, Data Cleaning, Text Classification, and Beyond	Jun 1, 2021	Data Augmentation, Entity Resolution	Code Available	1	5
PIZZA: A new benchmark for complex end-to-end task-oriented parsing	Dec 1, 2022	Entity Resolution	Code Available	1	5
Match, Compare, or Select? An Investigation of Large Language Models for Entity Matching	May 27, 2024	Entity Resolution	Code Available	1	5
Entity Matching using Large Language Models	Oct 17, 2023	Data Integration, Entity Resolution	Code Available	1	5
Deep Indexed Active Learning for Matching Heterogeneous Entity Representations	Apr 8, 2021	Active Learning, Blocking	Code Available	1	5
A Critical Re-evaluation of Neural Methods for Entity Alignment	Apr 1, 2022	Entity Alignment, Entity Resolution	Code Available	1	5
Deep Entity Matching with Pre-Trained Language Models	Apr 1, 2020	Data Augmentation, Entity Resolution	Code Available	1	5
Cost-Effective In-Context Learning for Entity Resolution: A Design Space Exploration	Dec 7, 2023	Data Integration, Entity Resolution	Code Available	1	5
Supervised Contrastive Learning for Product Matching	Feb 4, 2022	Contrastive Learning, Data Augmentation	Code Available	1	5
A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching	Sep 17, 2020	Deep Learning, Entity Resolution	Code Available	1	5
Entity Resolution with Hierarchical Graph Attention Networks	Jun 1, 2022	Attribute, Entity Resolution	Code Available	1	5
WDC Products: A Multi-Dimensional Entity Matching Benchmark	Jan 23, 2023	Contrastive Learning, Data Integration	Code Available	1	5
Fine-tuning Large Language Models for Entity Matching	Sep 12, 2024	Data Integration, Entity Resolution	Code Available	1	5
Estimating the Performance of Entity Resolution Algorithms: Lessons Learned Through PatentsView.org	Oct 3, 2022	Entity Resolution	Code Available	1	5
Dual-Objective Fine-Tuning of BERT for Entity Matching	Jun 1, 2021	Data Integration, Entity Resolution	Code Available	1	5
Domain Adaptation for Deep Entity Resolution: A Design Space Exploration	Jun 1, 2022	Data Integration, Domain Adaptation	Code Available	1	5
A Practioner's Guide to Evaluating Entity Resolution Results	Sep 14, 2015	Clustering, Entity Resolution	Code Available	1	5
Using ChatGPT for Entity Matching	May 5, 2023	Data Integration, Entity Resolution	Code Available	1	5
Pre-trained Embeddings for Entity Resolution: An Experimental Analysis [Experiment, Analysis & Benchmark]	Apr 24, 2023	Blocking, Deep Learning	Code Available	1	5
Unicorn: A Unified Multi-tasking Model for Supporting Matching Tasks in Data Integration	May 1, 2023	Data Integration, Entity Resolution	Code Available	1	5
SC-Block: Supervised Contrastive Blocking within Entity Resolution Pipelines	Mar 6, 2023	Blocking, Contrastive Learning	Code Available	0	5
Probing the Robustness of Pre-trained Language Models for Entity Matching	Oct 1, 2022	Data Augmentation, Domain Generalization	Code Available	0	5
Profiling Entity Matching Benchmark Tasks	Oct 19, 2020	Data Integration, Entity Resolution	Code Available	0	5
ZeroER: Entity Resolution using Zero Labeled Examples	Aug 16, 2019	Entity Resolution	Code Available	0	5
Analyzing how BERT performs entity matching	Apr 1, 2022	Entity Resolution, Semantic Similarity	Code Available	0	5
Learning Text Representations for 500K Classification Tasks on Named Entity Disambiguation	Oct 1, 2018	All, Data Augmentation	Code Available	0	5
In Search of an Entity Resolution OASIS: Optimal Asymptotic Sequential Importance Sampling	Mar 2, 2017	Entity Resolution	Code Available	0	5
Optimal Transport-based Alignment of Learned Character Representations for String Similarity	Jul 23, 2019	Entity Resolution	Code Available	0	5
Text2Tracks: Prompt-based Music Recommendation via Generative Retrieval	Mar 31, 2025	Entity Resolution, Music Recommendation	Code Available	0	5
FairER: Entity Resolution with Fairness Constraints	Oct 30, 2021	Diversity, Entity Resolution	Code Available	0	5
Effective Explanations for Entity Resolution Models	Mar 24, 2022	Attribute, counterfactual	Code Available	0	5
FlexER: Flexible Entity Resolution for Multiple Intents	Aug 23, 2022	Entity Resolution, Graph Neural Network	Code Available	0	5
Deep Learning for Entity Matching: A Design Space Exploration	May 1, 2018	Deep Learning, Entity Linking	Code Available	0	5
Active Gradual Machine Learning for Entity Resolution	Jan 16, 2022	Active Learning, BIG-bench Machine Learning	Code Available	0	5
ChatPD: An LLM-driven Paper-Dataset Networking System	May 28, 2025	Entity Resolution, Open Information Extraction	Code Available	0	5
EAGER: Embedding-Assisted Entity Resolution for Knowledge Graphs	Jan 15, 2021	Attribute, Entity Resolution	Code Available	0	5
Graph-boosted Active Learning for Multi-Source Entity Resolution	Sep 30, 2021	Active Learning, Entity Resolution	Code Available	0	5
CEREC: A Corpus for Entity Resolution in Email Conversations	May 21, 2021	coreference-resolution, Coreference Resolution	Code Available	0	5
Crowdsourcing and Aggregating Nested Markable Annotations	Jul 1, 2019	coreference-resolution, Coreference Resolution	Code Available	0	5
Accelerating Column Generation via Flexible Dual Optimal Inequalities with Application to Entity Resolution	Sep 12, 2019	Clustering, Entity Resolution	Code Available	0	5
d-blink: Distributed End-to-End Bayesian Entity Resolution	Sep 13, 2019	Blocking, Entity Resolution	Code Available	0	5
Deduplication Over Heterogeneous Attribute Types (D-HAT)	Nov 24, 2022	Attribute, Clustering	Code Available	0	5
Bonafide at LegalLens 2024 Shared Task: Using Lightweight DeBERTa Based Encoder For Legal Violation Detection and Resolution	Oct 30, 2024	Entity Resolution, Natural Language Inference	Code Available	0	5
Biomedical Named Entity Recognition at Scale	Nov 12, 2020	De-identification, Entity Resolution	Code Available	0	5
Cross-Language Learning for Entity Matching	Oct 7, 2021	Cross-Lingual Transfer, Entity Resolution	Code Available	0	5
A Critical Re-evaluation of Benchmark Datasets for (Deep) Learning-Based Matching Algorithms	Jul 3, 2023	Entity Resolution	Code Available	0	5

Datasets: Amazon-Google, Abt-Buy, WDC Products-80%cc-seen-medium, WDC Computers-small, WDC Computers-xlarge, WDC Products-50%cc-unseen-medium, WDC Watches-small, MusicBrainz20K, WDC Watches-xlarge, WDC Products-80%cc-seen-medium-multi, WDC Products. The tables under Benchmark Results appear to follow this order, one table per dataset.

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	gpt4-0613_fewshot-10	F1 (%)	85.21	—	Unverified
2	gpt-4o-mini-2024-07-18_fine_tuned	F1 (%)	80.25	—	Unverified
3	RoBERTa-SupCon	F1 (%)	79.28	—	Unverified
4	RobEM	F1 (%)	79.06	—	Unverified
5	Random Forest	F1 (%)	79	—	Unverified
6	HG	F1 (%)	76.4	—	Unverified
7	Ditto	F1 (%)	75.58	—	Unverified
8	CorDEL-Sum	F1 (%)	70.2	—	Unverified
9	DeepMatcher - Hybrid	F1 (%)	69.3	—	Unverified
10	D-HAT	F1 (%)	67.5	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	gpt4-0613_zeroshot	F1 (%)	95.78	—	Unverified
2	RoBERTa-SupCon	F1 (%)	94.29	—	Unverified
3	gpt-4o-mini-2024-07-18_fine_tuned	F1 (%)	94.09	—	Unverified
4	gpt-4o-2024-08-06	F1 (%)	92.2	—	Unverified
5	RobEM	F1 (%)	90.9	—	Unverified
6	HG	F1 (%)	89.8	—	Unverified
7	Ditto	F1 (%)	89.33	—	Unverified
8	gpt-4o-mini-2024-07-18	F1 (%)	87.68	—	Unverified
9	Meta-Llama-3.1-8B-Instruct_fine_tuned	F1 (%)	87.34	—	Unverified
10	Random Forest	F1 (%)	85	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	gpt4-0613_zeroshot	F1 (%)	89.61	—	Unverified
2	gpt-4o-2024-08-06_fine_tuned_wdc_small	F1 (%)	87.1	—	Unverified
3	gpt-4o-mini-2024-07-18_structured_explanations	F1 (%)	84.38	—	Unverified
4	gpt-4o-mini-2024-07-18	F1 (%)	81.61	—	Unverified
5	RoBERTa-SupCon	F1 (%)	79.99	—	Unverified
6	Llama3.1_70B_structured_explanations	F1 (%)	76.7	—	Unverified
7	Llama3.1_70B	F1 (%)	75.2	—	Unverified
8	Llama3.1_8B_error-based_example_selection	F1 (%)	74.37	—	Unverified
9	Llama3.1_8B_structured_explanations	F1 (%)	74.13	—	Unverified
10	Ditto	F1 (%)	73.93	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	BERT	F1 (%)	96.53	—	Unverified
2	RoBERTa-SupCon	F1 (%)	95.21	—	Unverified
3	HG	F1 (%)	88.5	—	Unverified
4	DADER-MMD	F1 (%)	88	—	Unverified
5	Ditto	F1 (%)	80.76	—	Unverified
6	JointBERT	F1 (%)	77.55	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	RoBERTa-SupCon	F1 (%)	98.33	—	Unverified
2	JointBERT	F1 (%)	97.49	—	Unverified
3	BERT	F1 (%)	97.37	—	Unverified
4	HG	F1 (%)	96.5	—	Unverified
5	Ditto	F1 (%)	95.45	—	Unverified
6	Random Forest	F1 (%)	78	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	RoBERTa-base	F1 (%)	71.14	—	Unverified
2	Ditto	F1 (%)	70.66	—	Unverified
3	HG	F1 (%)	68.74	—	Unverified
4	RoBERTa-SupCon	F1 (%)	57.23	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	HG	F1 (%)	94	—	Unverified
2	DADER-NoDA	F1 (%)	88.6	—	Unverified
3	Ditto	F1 (%)	85.12	—	Unverified
4	JointBERT	F1 (%)	75.83	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ALMSER-GB	F1	0.95	—	Unverified
2	FAMER-SplitMerge	F1	0.88	—	Unverified
3	FAMER-Split	F1	0.84	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	JointBERT	F1 (%)	97.09	—	Unverified
2	Ditto	F1 (%)	96.53	—	Unverified
3	HG	F1 (%)	96.5	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	RoBERTa-SupCon	F1 Micro	88.63	—	Unverified
2	RoBERTa-base	F1 Micro	52.03	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	gpt-4o-2024-08-06_fine_tuned_wdc_small	F1 (%)	87.07	—	Unverified
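
The tables above report F1, the harmonic mean of precision and recall. As a rough sketch of how such a score can be computed for pairwise matching, the function below treats predictions and ground truth as sets of matched record-ID pairs. The function name `pairwise_f1` and the example pairs are hypothetical, and the evaluation protocols behind the claimed numbers may differ from this one (the F1 Micro table, for instance, uses a different averaging scheme).

```python
def pairwise_f1(predicted_pairs, true_pairs):
    """Return (precision, recall, F1) over match decisions.

    Both arguments are iterables of (left_id, right_id) tuples;
    a pair counts as correct only if it appears in both sets.
    """
    predicted = set(predicted_pairs)
    actual = set(true_pairs)
    true_positives = len(predicted & actual)

    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: 3 predicted matches, 4 true matches, 2 in common.
predicted = [("a1", "b1"), ("a2", "b2"), ("a3", "b4")]
truth = [("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a5", "b5")]

precision, recall, f1 = pairwise_f1(predicted, truth)
print(f"precision={precision:.4f} recall={recall:.4f} F1 (%)={100 * f1:.2f}")
# precision=0.6667 recall=0.5000 F1 (%)=57.14
```

Because non-matching pairs vastly outnumber matching ones in most entity resolution datasets, F1 on the match class is usually more informative than accuracy.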