Blocking

Entity resolution (also known as entity matching, record linkage, or duplicate detection) is the task of finding records that refer to the same real-world entity across different data sources (e.g., data files, books, websites, and databases). (Source: Wikipedia)

Blocking is a crucial step in any entity resolution pipeline because a pair-wise comparison of all records across two data sources is usually infeasible: the number of pairs grows with the product of the two source sizes. Blocking applies a computationally cheap method to generate a smaller set of candidate record pairs, reducing the workload of the matcher. During matching, a more expensive pair-wise matcher examines only these candidates and generates the final set of matching record pairs.
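The two-stage idea can be illustrated with a minimal sketch. It assumes simple token blocking (two records become a candidate pair if they share at least one lowercase token) and a character-level similarity from Python's standard `difflib` as the "expensive" matcher. The toy records, the function names, and the 0.8 threshold are hypothetical choices made for illustration, not a method taken from any of the papers listed here.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import product


def token_blocks(records):
    """Map each lowercase token to the ids of the records containing it."""
    blocks = defaultdict(set)
    for rec_id, text in records.items():
        for token in text.lower().split():
            blocks[token].add(rec_id)
    return blocks


def candidate_pairs(source_a, source_b):
    """Cheap blocking step: keep only cross-source pairs that share a token."""
    blocks_a = token_blocks(source_a)
    blocks_b = token_blocks(source_b)
    pairs = set()
    for token in blocks_a.keys() & blocks_b.keys():
        pairs.update(product(blocks_a[token], blocks_b[token]))
    return pairs


def match(source_a, source_b, pairs, threshold=0.8):
    """Costlier matching step: string similarity on candidate pairs only."""
    matches = set()
    for id_a, id_b in pairs:
        score = SequenceMatcher(
            None, source_a[id_a].lower(), source_b[id_b].lower()
        ).ratio()
        if score >= threshold:  # threshold is an assumed, untuned value
            matches.add((id_a, id_b))
    return matches


# Hypothetical toy data, invented for this example.
source_a = {
    "a1": "Apple iPhone 12 64GB",
    "a2": "Samsung Galaxy S21",
    "a3": "Sony WH-1000XM4 Headphones",
}
source_b = {
    "b1": "apple iphone 12 64gb black",
    "b2": "Samsung Galaxy S21 5G",
    "b3": "Sony Bravia TV",
}

pairs = candidate_pairs(source_a, source_b)
all_pairs = len(source_a) * len(source_b)
matches = match(source_a, source_b, pairs)

print(f"candidate pairs after blocking: {len(pairs)} of {all_pairs}")
print(f"pairs accepted by the matcher: {sorted(matches)}")
```

On this toy input, blocking keeps three of the nine possible pairs. The matcher then rejects the pair that only shares the token "sony", which shows why blocking output is a candidate set rather than a final answer.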

Survey on blocking:

Papadakis et al.: Blocking and Filtering Techniques for Entity Resolution: A Survey, 2020.

Papers

Showing 1–50 of 524 papers

Title	Date	Tasks	Status	Hype	Score
SemanticDraw: Towards Real-Time Interactive Content Creation from Image Diffusion Models	Mar 14, 2024	Blocking, GPU	Code Available	4	5
SINet: Extreme Lightweight Portrait Segmentation Networks with Spatial Squeeze Modules and Information Blocking Decoder	Nov 20, 2019	Blocking, Decoder	Code Available	2	5
ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention	Jan 1, 2024	Blocking	Code Available	2	5
Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents	May 30, 2025	Benchmarking, Blocking	Code Available	2	5
AdFlush: A Real-World Deployable Machine Learning Solution for Effective Advertisement and Web Tracker Prevention	May 13, 2024	Blocking, CPU	Code Available	2	5
Efficient LLM Scheduling by Learning to Rank	Aug 28, 2024	Blocking, Chatbot	Code Available	2	5
Wavelet Diffusion Models are fast and scalable Image Generators	Nov 29, 2022	Blocking, Image Generation	Code Available	2	5
Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction	Apr 12, 2024	Blocking, Management	Code Available	1	5
Should Graph Convolution Trust Neighbors? A Simple Causal Inference Method	Oct 22, 2020	Blocking, Causal Inference	Code Available	1	5
StyleSwin: Transformer-based GAN for High-resolution Image Generation	Dec 20, 2021	Blocking, Computational Efficiency	Code Available	1	5
S^2-Transformer for Mask-Aware Hyperspectral Image Reconstruction	Sep 24, 2022	Blocking, Disentanglement	Code Available	1	5
Self-Destructing Models: Increasing the Costs of Harmful Dual Uses of Foundation Models	Nov 27, 2022	Blocking, Meta-Learning	Code Available	1	5
ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting	Nov 19, 2022	Blocking, Language Modeling	Code Available	1	5
Sparkly: A Simple yet Surprisingly Strong TF/IDF Blocker for Entity Matching	Apr 20, 2023	Blocking	Code Available	1	5
O^2-Recon: Completing 3D Reconstruction of Occluded Objects in the Scene with a Pre-trained 2D Diffusion Model	Aug 18, 2023	3D Reconstruction, Blocking	Code Available	1	5
NeuralPassthrough: Learned Real-Time View Synthesis for VR	Jul 5, 2022	Blocking, Neural Rendering	Code Available	1	5
Path-Specific Counterfactual Fairness for Recommender Systems	Jun 5, 2023	Blocking, counterfactual	Code Available	1	5
ISTA-Net++: Flexible Deep Unfolding Network for Compressive Sensing	Mar 22, 2021	Blocking, Compressive Sensing	Code Available	1	5
Masked Graph Autoencoder with Non-discrete Bandwidths	Feb 6, 2024	Blocking, Link Prediction	Code Available	1	5
Multi-scale Attention Network for Single Image Super-Resolution	Sep 28, 2022	Blocking, Image Super-Resolution	Code Available	1	5
Neural Text Generation with Unlikelihood Training	Aug 12, 2019	Blocking, Text Generation	Code Available	1	5
Queue management for slo-oriented large language model serving	Jun 5, 2024	Blocking, GPU	Code Available	1	5
Reinforcement Learning for Dynamic Resource Allocation in Optical Networks: Hype or Hope?	Feb 18, 2025	Benchmarking, Blocking	Code Available	1	5
Road Planning for Slums via Deep Reinforcement Learning	May 22, 2023	Blocking, Deep Reinforcement Learning	Code Available	1	5
DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification	Jun 3, 2021	Blocking, Efficient ViTs	Code Available	1	5
Backdoor Attacks on Vision Transformers	Jun 16, 2022	Blocking	Code Available	1	5
Learning a Single Model with a Wide Range of Quality Factors for JPEG Image Artifacts Removal	Sep 15, 2020	Blocking, JPEG Artifact Correction	Code Available	1	5
A Knowledge Graph Embeddings based Approach for Author Name Disambiguation using Literals	Jan 24, 2022	Articles, Blocking	Code Available	1	5
Gandalf the Red: Adaptive Security for LLMs	Jan 14, 2025	Blocking, Language Modeling	Code Available	1	5
AMP-Net: Denoising based Deep Unfolding for Compressive Image Sensing	Apr 21, 2020	Blocking, Compressive Sensing	Code Available	1	5
AltDiffusion: A Multilingual Text-to-Image Diffusion Model	Aug 19, 2023	Blocking, Concept Alignment	Code Available	1	5
AutoBlock: A Hands-off Blocking Framework for Entity Matching	Dec 7, 2019	Blocking, Entity Resolution	Code Available	1	5
Deep Indexed Active Learning for Matching Heterogeneous Entity Representations	Apr 8, 2021	Active Learning, Blocking	Code Available	1	5
From General to Specific: Informative Scene Graph Generation via Balance Adjustment	Aug 30, 2021	Blocking, Graph Generation	Code Available	1	5
AutoDAN: Interpretable Gradient-Based Adversarial Attacks on Large Language Models	Oct 23, 2023	Adversarial Attack, Blocking	Code Available	1	5
AutoFR: Automated Filter Rule Generation for Adblocking	Feb 25, 2022	Blocking	Code Available	1	5
Generate, Prune, Select: A Pipeline for Counterspeech Generation against Online Hate Speech	Jun 3, 2021	Blocking, Diversity	Code Available	1	5
GraphSHA: Synthesizing Harder Samples for Class-Imbalanced Node Classification	Jun 16, 2023	Blocking, Classification	Code Available	1	5
COAST: COntrollable Arbitrary-Sampling NeTwork for Compressive Sensing	Jul 15, 2021	Blocking, Compressive Sensing	Code Available	1	5
Boosting Multi-view Stereo with Late Cost Aggregation	Jan 22, 2024	Blocking, Geometric Matching	Code Available	1	5
MDFlow: Unsupervised Optical Flow Learning by Reliable Mutual Knowledge Distillation	Nov 11, 2022	Blocking, Data Augmentation	Code Available	1	5
Multi-granularity Contrastive Cross-modal Collaborative Generation for End-to-End Long-term Video Question Answering	Oct 12, 2024	Answer Generation, Blocking	Code Available	1	5
LinkTransformer: A Unified Package for Record Linkage with Transformer Language Models	Sep 2, 2023	Blocking, Language Modelling	Code Available	1	5
NoLoCo: No-all-reduce Low Communication Training Method for Large Models	Jun 12, 2025	All, Blocking	Code Available	1	5
Content-aware Scalable Deep Compressed Sensing	Jul 19, 2022	Blocking, compressed sensing	Code Available	1	5
CORBA: Contagious Recursive Blocking Attacks on Multi-Agent Systems Based on Large Language Models	Feb 20, 2025	Blocking, Language Modeling	Code Available	1	5
Progent: Programmable Privilege Control for LLM Agents	Apr 16, 2025	Blocking	Code Available	1	5
Deep learning for blocking in entity matching: a design space exploration	Jul 1, 2021	Blocking	Code Available	1	5
A Novel Geo-Localization Method for UAV and Satellite Images Using Cross-View Consistent Attention	Sep 23, 2023	Blocking, Data Augmentation	Code Available	1	5
Pre-trained Embeddings for Entity Resolution: An Experimental Analysis [Experiment, Analysis & Benchmark]	Apr 24, 2023	Blocking, Deep Learning	Code Available	1	5

No leaderboard results yet.