Entity Resolution

Entity resolution (also known as entity matching, record linkage, or duplicate detection) is the task of finding records that refer to the same real-world entity across different data sources (e.g., data files, books, websites, and databases). (Source: Wikipedia)
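
As a rough illustration of the definition above, the following minimal sketch compares records from two sources by string similarity and reports the pairs that score above a cutoff. It is not the method of any paper listed on this page: the record data, the `normalize`, `similarity`, and `match_records` helpers, and the 0.8 threshold are all assumptions chosen for the example, and real systems typically add a blocking step and a learned matcher.

```python
from difflib import SequenceMatcher
from itertools import product


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.lower().split())


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], taken from difflib's ratio()."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_records(source_a, source_b, threshold=0.8):
    """Return (id_a, id_b, score) for every cross-source pair at or above threshold."""
    matches = []
    for (id_a, name_a), (id_b, name_b) in product(source_a.items(), source_b.items()):
        score = similarity(name_a, name_b)
        if score >= threshold:
            matches.append((id_a, id_b, round(score, 2)))
    return matches


# Hypothetical product records from two sources (illustrative data only).
shop_a = {"a1": "Apple iPhone 13 128GB Blue", "a2": "Sony WH-1000XM4 Headphones"}
shop_b = {"b1": "apple iphone 13  128gb blue", "b2": "Canon EOS R6 Camera Body"}

matches = match_records(shop_a, shop_b)
print(matches)
```

Comparing every pair is quadratic in the number of records, which is one reason several of the papers listed below study blocking, a way of narrowing the candidate pairs before matching.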

Surveys on entity resolution:

Entity resolution is closely related to entity alignment, which focuses on matching entities between knowledge bases. Entity linking differs from entity resolution in that it focuses on identifying entity mentions in free text.

Papers


Showing 1–50 of 184 papers

Title	Date	Tasks	Status	Hype
Can Foundation Models Wrangle Your Data?	May 20, 2022	Entity Resolution, Imputation	Code Available	5
Fine-tuning Large Language Models for Entity Matching	Sep 12, 2024	Data Integration, Entity Resolution	Code Available	1
Match, Compare, or Select? An Investigation of Large Language Models for Entity Matching	May 27, 2024	Entity Resolution	Code Available	1
How to Evaluate Entity Resolution Systems: An Entity-Centric Framework with Application to Inventor Name Disambiguation	Apr 8, 2024	Entity Resolution	Code Available	1
Cost-Effective In-Context Learning for Entity Resolution: A Design Space Exploration	Dec 7, 2023	Data Integration, Entity Resolution	Code Available	1
Entity Matching using Large Language Models	Oct 17, 2023	Data Integration, Entity Resolution	Code Available	1
Using ChatGPT for Entity Matching	May 5, 2023	Data Integration, Entity Resolution	Code Available	1
Unicorn: A Unified Multi-tasking Model for Supporting Matching Tasks in Data Integration	May 1, 2023	Data Integration, Entity Resolution	Code Available	1
Pre-trained Embeddings for Entity Resolution: An Experimental Analysis [Experiment, Analysis & Benchmark]	Apr 24, 2023	Blocking, Deep Learning	Code Available	1
WDC Products: A Multi-Dimensional Entity Matching Benchmark	Jan 23, 2023	Contrastive Learning, Data Integration	Code Available	1
PIZZA: A new benchmark for complex end-to-end task-oriented parsing	Dec 1, 2022	Entity Resolution	Code Available	1
Estimating the Performance of Entity Resolution Algorithms: Lessons Learned Through PatentsView.org	Oct 3, 2022	Entity Resolution	Code Available	1
Entity Resolution with Hierarchical Graph Attention Networks	Jun 1, 2022	Attribute, Entity Resolution	Code Available	1
Domain Adaptation for Deep Entity Resolution: A Design Space Exploration	Jun 1, 2022	Data Integration, Domain Adaptation	Code Available	1
A Critical Re-evaluation of Neural Methods for Entity Alignment	Apr 1, 2022	Entity Alignment, Entity Resolution	Code Available	1
Supervised Contrastive Learning for Product Matching	Feb 4, 2022	Contrastive Learning, Data Augmentation	Code Available	1
Dual-Objective Fine-Tuning of BERT for Entity Matching	Jun 1, 2021	Data Integration, Entity Resolution	Code Available	1
Rotom: A Meta-Learned Data Augmentation Framework for Entity Matching, Data Cleaning, Text Classification, and Beyond	Jun 1, 2021	Data Augmentation, Entity Resolution	Code Available	1
Deep Indexed Active Learning for Matching Heterogeneous Entity Representations	Apr 8, 2021	Active Learning, Blocking	Code Available	1
A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching	Sep 17, 2020	Deep Learning, Entity Resolution	Code Available	1
Intermediate Training of BERT for Product Matching	Aug 31, 2020	Entity Resolution, Language Modeling	Code Available	1
Deep Entity Matching with Pre-Trained Language Models	Apr 1, 2020	Data Augmentation, Entity Resolution	Code Available	1
AutoBlock: A Hands-off Blocking Framework for Entity Matching	Dec 7, 2019	Blocking, Entity Resolution	Code Available	1
A Practioner's Guide to Evaluating Entity Resolution Results	Sep 14, 2015	Clustering, Entity Resolution	Code Available	1
ChatPD: An LLM-driven Paper-Dataset Networking System	May 28, 2025	Entity Resolution, Open Information Extraction	Code Available	0
Automated Construction of a Knowledge Graph of Nuclear Fusion Energy for Effective Elicitation and Retrieval of Information	Apr 10, 2025	Entity Resolution, named-entity-recognition	Unverified	0
Text2Tracks: Prompt-based Music Recommendation via Generative Retrieval	Mar 31, 2025	Entity Resolution, Music Recommendation	Code Available	0
From Documents to Dialogue: Building KG-RAG Enhanced AI Assistants	Feb 21, 2025	Chatbot, Entity Resolution	Unverified	0
Leveraging User-Generated Metadata of Online Videos for Cover Song Identification	Dec 16, 2024	Cover song identification, Entity Resolution	Unverified	0
Leveraging large language models for efficient representation learning for entity resolution	Nov 15, 2024	Blocking, Contrastive Learning	Unverified	0
Bonafide at LegalLens 2024 Shared Task: Using Lightweight DeBERTa Based Encoder For Legal Violation Detection and Resolution	Oct 30, 2024	Entity Resolution, Natural Language Inference	Code Available	0
Gem: Gaussian Mixture Model Embeddings for Numerical Feature Distributions	Oct 9, 2024	Attribute, Entity Resolution	Unverified	0
T-KAER: Transparency-enhanced Knowledge-Augmented Entity Resolution Framework	Sep 30, 2024	Entity Resolution	Code Available	0
Learning variant product relationship and variation attributes from e-commerce website structures	Sep 17, 2024	Entity Resolution, RAG	Unverified	0
Entity Augmentation for Efficient Classification of Vertically Partitioned Data with Limited Overlap	Jun 25, 2024	Entity Alignment, Entity Resolution	Unverified	0
Learning from Natural Language Explanations for Generalizable Entity Matching	Jun 13, 2024	Binary Classification, Domain Generalization	Unverified	0
Towards Universal Dense Blocking for Entity Resolution	Apr 23, 2024	Blocking, Contrastive Learning	Code Available	0
Methods for Matching English Language Addresses	Mar 14, 2024	Entity Resolution	Unverified	0
Neural Locality Sensitive Hashing for Entity Blocking	Jan 31, 2024	Blocking, Entity Resolution	Unverified	0
Spatial Entity Resolution between Restaurant Locations and Transportation Destinations in Southeast Asia	Jan 16, 2024	Entity Resolution	Unverified	0
On Leveraging Large Language Models for Enhancing Entity Resolution: A Cost-efficient Approach	Jan 7, 2024	Entity Resolution	Unverified	0
Cost-Efficient Prompt Engineering for Unsupervised Entity Resolution	Oct 9, 2023	Entity Resolution, Feature Engineering	Unverified	0
Graph Representation Learning Towards Patents Network Analysis	Sep 25, 2023	Entity Resolution, Graph Representation Learning	Unverified	0
Labeling without Seeing? Blind Annotation for Privacy-Preserving Entity Resolution	Aug 7, 2023	Dataset Generation, Entity Resolution	Unverified	0
Revisiting Prompt Engineering via Declarative Crowdsourcing	Aug 7, 2023	Entity Resolution, Imputation	Unverified	0
Named Entity Resolution in Personal Knowledge Graphs	Jul 22, 2023	Entity Resolution, Knowledge Graphs	Unverified	0
A Critical Re-evaluation of Benchmark Datasets for (Deep) Learning-Based Matching Algorithms	Jul 3, 2023	Entity Resolution	Code Available	0
Record Deduplication for Entity Distribution Modeling in ASR Transcripts	Jun 9, 2023	Entity Resolution, speech-recognition	Unverified	0
Combining Global and Local Merges in Logic-based Entity Resolution	May 26, 2023	Entity Resolution	Unverified	0
Beyond Rule-based Named Entity Recognition and Relation Extraction for Process Model Generation from Natural Language Text	May 6, 2023	Entity Resolution, Feature Engineering	Unverified	0


Datasets (the benchmark tables below appear to follow this order): (1) Amazon-Google, (2) Abt-Buy, (3) WDC Products-80%cc-seen-medium, (4) WDC Computers-small, (5) WDC Computers-xlarge, (6) WDC Products-50%cc-unseen-medium, (7) WDC Watches-small, (8) MusicBrainz20K, (9) WDC Watches-xlarge, (10) WDC Products-80%cc-seen-medium-multi, (11) WDC Products

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	gpt4-0613_fewshot-10	F1 (%)	85.21	—	Unverified
2	gpt-4o-mini-2024-07-18_fine_tuned	F1 (%)	80.25	—	Unverified
3	RoBERTa-SupCon	F1 (%)	79.28	—	Unverified
4	RobEM	F1 (%)	79.06	—	Unverified
5	Random Forest	F1 (%)	79	—	Unverified
6	HG	F1 (%)	76.4	—	Unverified
7	Ditto	F1 (%)	75.58	—	Unverified
8	CorDEL-Sum	F1 (%)	70.2	—	Unverified
9	DeepMatcher - Hybrid	F1 (%)	69.3	—	Unverified
10	D-HAT	F1 (%)	67.5	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	gpt4-0613_zeroshot	F1 (%)	95.78	—	Unverified
2	RoBERTa-SupCon	F1 (%)	94.29	—	Unverified
3	gpt-4o-mini-2024-07-18_fine_tuned	F1 (%)	94.09	—	Unverified
4	gpt-4o-2024-08-06	F1 (%)	92.2	—	Unverified
5	RobEM	F1 (%)	90.9	—	Unverified
6	HG	F1 (%)	89.8	—	Unverified
7	Ditto	F1 (%)	89.33	—	Unverified
8	gpt-4o-mini-2024-07-18	F1 (%)	87.68	—	Unverified
9	Meta-Llama-3.1-8B-Instruct_fine_tuned	F1 (%)	87.34	—	Unverified
10	Random Forest	F1 (%)	85	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	gpt4-0613_zeroshot	F1 (%)	89.61	—	Unverified
2	gpt-4o-2024-08-06_fine_tuned_wdc_small	F1 (%)	87.1	—	Unverified
3	gpt-4o-mini-2024-07-18_structured_explanations	F1 (%)	84.38	—	Unverified
4	gpt-4o-mini-2024-07-18	F1 (%)	81.61	—	Unverified
5	RoBERTa-SupCon	F1 (%)	79.99	—	Unverified
6	Llama3.1_70B_structured_explanations	F1 (%)	76.7	—	Unverified
7	Llama3.1_70B	F1 (%)	75.2	—	Unverified
8	Llama3.1_8B_error-based_example_selection	F1 (%)	74.37	—	Unverified
9	Llama3.1_8B_structured_explanations	F1 (%)	74.13	—	Unverified
10	Ditto	F1 (%)	73.93	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	BERT	F1 (%)	96.53	—	Unverified
2	RoBERTa-SupCon	F1 (%)	95.21	—	Unverified
3	HG	F1 (%)	88.5	—	Unverified
4	DADER-MMD	F1 (%)	88	—	Unverified
5	Ditto	F1 (%)	80.76	—	Unverified
6	JointBERT	F1 (%)	77.55	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	RoBERTa-SupCon	F1 (%)	98.33	—	Unverified
2	JointBERT	F1 (%)	97.49	—	Unverified
3	BERT	F1 (%)	97.37	—	Unverified
4	HG	F1 (%)	96.5	—	Unverified
5	Ditto	F1 (%)	95.45	—	Unverified
6	Random Forest	F1 (%)	78	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	RoBERTa-base	F1 (%)	71.14	—	Unverified
2	Ditto	F1 (%)	70.66	—	Unverified
3	HG	F1 (%)	68.74	—	Unverified
4	RoBERTa-SupCon	F1 (%)	57.23	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	HG	F1 (%)	94	—	Unverified
2	DADER-NoDA	F1 (%)	88.6	—	Unverified
3	Ditto	F1 (%)	85.12	—	Unverified
4	JointBERT	F1 (%)	75.83	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	ALMSER-GB	F1	0.95	—	Unverified
2	FAMER-SplitMerge	F1	0.88	—	Unverified
3	FAMER-Split	F1	0.84	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	JointBERT	F1 (%)	97.09	—	Unverified
2	Ditto	F1 (%)	96.53	—	Unverified
3	HG	F1 (%)	96.5	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	RoBERTa-SupCon	F1 Micro	88.63	—	Unverified
2	RoBERTa-base	F1 Micro	52.03	—	Unverified

#	Model	Metric	Claimed	Verified	Status
1	gpt-4o-2024-08-06_fine_tuned_wdc_small	F1 (%)	87.07	—	Unverified
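
The leaderboards above rank models by F1, the harmonic mean of precision and recall, reported mostly as a percentage and in one table on a 0 to 1 scale. The page does not say how each paper computed its score, so the following is only a minimal sketch of one common convention, pairwise F1 over predicted versus true matching pairs. The `pairwise_f1` helper and the record identifiers are hypothetical.

```python
def pairwise_f1(predicted, actual):
    """F1 over unordered record pairs; returns a value in [0, 1]."""
    # Treat (a, b) and (b, a) as the same pair.
    pred = {frozenset(pair) for pair in predicted}
    gold = {frozenset(pair) for pair in actual}
    true_pos = len(pred & gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred)  # share of predicted matches that are correct
    recall = true_pos / len(gold)     # share of true matches that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical ground truth and predictions (illustrative data only).
gold_pairs = [("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")]
predicted_pairs = [("b1", "a1"), ("a2", "b2"), ("a3", "b9")]

score = pairwise_f1(predicted_pairs, gold_pairs)
print(f"F1 = {score:.4f} ({100 * score:.2f}%)")
```

Here two of three predictions are correct (precision 2/3) and two of four true matches are found (recall 1/2), giving F1 = 4/7. Multiplying by 100 gives the percentage form used in most of the tables.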