SOTAVerified

Blocking

Entity resolution (also known as entity matching, record linkage, or duplicate detection) is the task of finding records that refer to the same real-world entity across different data sources (e.g., data files, books, websites, and databases). (Source: Wikipedia)

Blocking is a crucial step in any entity resolution pipeline because comparing every record in one data source with every record in the other is infeasible at scale: the number of pairs grows with the product of the two source sizes. Blocking applies a computationally cheap method to generate a much smaller set of candidate record pairs, reducing the workload of the matcher. During matching, a more expensive pairwise matcher then produces the final set of matching record pairs from those candidates.
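The two-stage idea described above can be illustrated with a minimal sketch. It assumes token blocking (two records become a candidate pair if they share at least one token) as the cheap method and a Jaccard-similarity threshold as a stand-in for the expensive matcher. Both choices, the function names, the threshold, and the sample records are hypothetical and are not taken from any paper listed on this page.

```python
from collections import defaultdict


def tokens(record: str) -> set[str]:
    """Lowercased whitespace tokens of a record (illustrative tokenizer)."""
    return set(record.lower().split())


def block_candidates(left: list[str], right: list[str]) -> set[tuple[int, int]]:
    """Cheap blocking step: pair records that share at least one token."""
    # Inverted index from token to the positions of right-hand records.
    index = defaultdict(list)
    for j, rec in enumerate(right):
        for tok in tokens(rec):
            index[tok].append(j)

    candidates = set()
    for i, rec in enumerate(left):
        for tok in tokens(rec):
            for j in index.get(tok, ()):
                candidates.add((i, j))
    return candidates


def jaccard(a: str, b: str) -> float:
    """Stand-in for an expensive pairwise matcher (assumption)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def match(left: list[str], right: list[str], threshold: float = 0.5):
    """Run the matcher only on the candidate pairs produced by blocking."""
    return sorted(
        (i, j)
        for i, j in block_candidates(left, right)
        if jaccard(left[i], right[j]) >= threshold
    )


# Made-up sample records from two hypothetical sources.
left = ["Apple iPhone 13 128GB", "Samsung Galaxy S21", "Sony WH-1000XM4 headphones"]
right = ["iPhone 13 Apple 128GB black", "Galaxy S21 by Samsung", "Bose QuietComfort headphones"]

candidates = block_candidates(left, right)
matches = match(left, right)
print(len(candidates), "candidate pairs out of", len(left) * len(right))
print(matches)
```

On this toy input the blocker keeps 3 of the 9 possible pairs, and the matcher accepts two of them. Real blockers are tuned to keep recall high while discarding most non-matching pairs, so a production system would likely use a stronger key or similarity index than single shared tokens.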

Survey on blocking:

Papers

Showing 1-50 of 524 papers

Title | Status | Hype
SemanticDraw: Towards Real-Time Interactive Content Creation from Image Diffusion Models | Code | 4
SINet: Extreme Lightweight Portrait Segmentation Networks with Spatial Squeeze Modules and Information Blocking Decoder | Code | 2
ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention | Code | 2
Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents | Code | 2
AdFlush: A Real-World Deployable Machine Learning Solution for Effective Advertisement and Web Tracker Prevention | Code | 2
Efficient LLM Scheduling by Learning to Rank | Code | 2
Wavelet Diffusion Models are fast and scalable Image Generators | Code | 2
Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction | Code | 1
Should Graph Convolution Trust Neighbors? A Simple Causal Inference Method | Code | 1
StyleSwin: Transformer-based GAN for High-resolution Image Generation | Code | 1
S^2-Transformer for Mask-Aware Hyperspectral Image Reconstruction | Code | 1
Self-Destructing Models: Increasing the Costs of Harmful Dual Uses of Foundation Models | Code | 1
ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting | Code | 1
Sparkly: A Simple yet Surprisingly Strong TF/IDF Blocker for Entity Matching | Code | 1
O^2-Recon: Completing 3D Reconstruction of Occluded Objects in the Scene with a Pre-trained 2D Diffusion Model | Code | 1
NeuralPassthrough: Learned Real-Time View Synthesis for VR | Code | 1
Path-Specific Counterfactual Fairness for Recommender Systems | Code | 1
ISTA-Net++: Flexible Deep Unfolding Network for Compressive Sensing | Code | 1
Masked Graph Autoencoder with Non-discrete Bandwidths | Code | 1
Multi-scale Attention Network for Single Image Super-Resolution | Code | 1
Neural Text Generation with Unlikelihood Training | Code | 1
Queue management for slo-oriented large language model serving | Code | 1
Reinforcement Learning for Dynamic Resource Allocation in Optical Networks: Hype or Hope? | Code | 1
Road Planning for Slums via Deep Reinforcement Learning | Code | 1
DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification | Code | 1
Backdoor Attacks on Vision Transformers | Code | 1
Learning a Single Model with a Wide Range of Quality Factors for JPEG Image Artifacts Removal | Code | 1
A Knowledge Graph Embeddings based Approach for Author Name Disambiguation using Literals | Code | 1
Gandalf the Red: Adaptive Security for LLMs | Code | 1
AMP-Net: Denoising based Deep Unfolding for Compressive Image Sensing | Code | 1
AltDiffusion: A Multilingual Text-to-Image Diffusion Model | Code | 1
AutoBlock: A Hands-off Blocking Framework for Entity Matching | Code | 1
Deep Indexed Active Learning for Matching Heterogeneous Entity Representations | Code | 1
From General to Specific: Informative Scene Graph Generation via Balance Adjustment | Code | 1
AutoDAN: Interpretable Gradient-Based Adversarial Attacks on Large Language Models | Code | 1
AutoFR: Automated Filter Rule Generation for Adblocking | Code | 1
Generate, Prune, Select: A Pipeline for Counterspeech Generation against Online Hate Speech | Code | 1
GraphSHA: Synthesizing Harder Samples for Class-Imbalanced Node Classification | Code | 1
COAST: COntrollable Arbitrary-Sampling NeTwork for Compressive Sensing | Code | 1
Boosting Multi-view Stereo with Late Cost Aggregation | Code | 1
MDFlow: Unsupervised Optical Flow Learning by Reliable Mutual Knowledge Distillation | Code | 1
Multi-granularity Contrastive Cross-modal Collaborative Generation for End-to-End Long-term Video Question Answering | Code | 1
LinkTransformer: A Unified Package for Record Linkage with Transformer Language Models | Code | 1
NoLoCo: No-all-reduce Low Communication Training Method for Large Models | Code | 1
Content-aware Scalable Deep Compressed Sensing | Code | 1
CORBA: Contagious Recursive Blocking Attacks on Multi-Agent Systems Based on Large Language Models | Code | 1
Progent: Programmable Privilege Control for LLM Agents | Code | 1
Deep learning for blocking in entity matching: a design space exploration | Code | 1
A Novel Geo-Localization Method for UAV and Satellite Images Using Cross-View Consistent Attention | Code | 1
Pre-trained Embeddings for Entity Resolution: An Experimental Analysis [Experiment, Analysis & Benchmark] | Code | 1

No leaderboard results yet.