SOTAVerified

Blocking

Entity resolution (also known as entity matching, record linkage, or duplicate detection) is the task of finding records that refer to the same real-world entity across different data sources (e.g., data files, books, websites, and databases). (Source: Wikipedia)

Blocking is a crucial step in any entity resolution pipeline because a pair-wise comparison of all records across two data sources is infeasible at scale: the number of pairs grows with the product of the two source sizes. Blocking applies a computationally cheap method to generate a smaller set of candidate record pairs, reducing the workload of the matcher. During matching, a more expensive pair-wise matcher then generates the final set of matching record pairs from these candidates.
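As a rough illustration of the idea, the sketch below implements one simple form of blocking, grouping records by a cheap key and pairing only records that share it. The record layout (`id`, `name`), the three-character prefix key, and the function names `blocking_key` and `candidate_pairs` are all assumptions made for this example; they are not taken from any of the papers listed here.

```python
from collections import defaultdict
from itertools import product


def blocking_key(record):
    """Hypothetical cheap key: lowercased first three characters of the name."""
    return record["name"].strip().lower()[:3]


def candidate_pairs(source_a, source_b, key=blocking_key):
    """Return (id_a, id_b) pairs whose records share the same blocking key."""
    # Each block holds the ids from source A and source B that map to one key.
    blocks = defaultdict(lambda: ([], []))
    for rec in source_a:
        blocks[key(rec)][0].append(rec["id"])
    for rec in source_b:
        blocks[key(rec)][1].append(rec["id"])

    pairs = []
    for ids_a, ids_b in blocks.values():
        # Only records inside the same block are compared later by the matcher.
        pairs.extend(product(ids_a, ids_b))
    return pairs


# Toy data, invented for illustration only.
source_a = [
    {"id": "a1", "name": "Jonathan Smith"},
    {"id": "a2", "name": "Maria Garcia"},
    {"id": "a3", "name": "Wei Chen"},
]
source_b = [
    {"id": "b1", "name": "Jon Smith"},
    {"id": "b2", "name": "Maria Garcia-Lopez"},
    {"id": "b3", "name": "Peter Novak"},
]

pairs = candidate_pairs(source_a, source_b)
print(pairs)
print(len(pairs), "candidate pairs instead of", len(source_a) * len(source_b))
```

A prefix key like this is cheap but brittle: a typo in the first characters would place two matching records in different blocks, which is one reason learned or TF/IDF-based blockers are studied.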

Survey on blocking:

Papers

Showing 1-50 of 524 papers

Title | Status | Hype
SemanticDraw: Towards Real-Time Interactive Content Creation from Image Diffusion Models | Code | 4
Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents | Code | 2
Efficient LLM Scheduling by Learning to Rank | Code | 2
AdFlush: A Real-World Deployable Machine Learning Solution for Effective Advertisement and Web Tracker Prevention | Code | 2
ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention | Code | 2
Wavelet Diffusion Models are fast and scalable Image Generators | Code | 2
SINet: Extreme Lightweight Portrait Segmentation Networks with Spatial Squeeze Modules and Information Blocking Decoder | Code | 2
NoLoCo: No-all-reduce Low Communication Training Method for Large Models | Code | 1
Progent: Programmable Privilege Control for LLM Agents | Code | 1
CORBA: Contagious Recursive Blocking Attacks on Multi-Agent Systems Based on Large Language Models | Code | 1
Reinforcement Learning for Dynamic Resource Allocation in Optical Networks: Hype or Hope? | Code | 1
Gandalf the Red: Adaptive Security for LLMs | Code | 1
Multi-granularity Contrastive Cross-modal Collaborative Generation for End-to-End Long-term Video Question Answering | Code | 1
Queue management for slo-oriented large language model serving | Code | 1
Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction | Code | 1
Masked Graph Autoencoder with Non-discrete Bandwidths | Code | 1
Boosting Multi-view Stereo with Late Cost Aggregation | Code | 1
AutoDAN: Interpretable Gradient-Based Adversarial Attacks on Large Language Models | Code | 1
A Novel Geo-Localization Method for UAV and Satellite Images Using Cross-View Consistent Attention | Code | 1
LinkTransformer: A Unified Package for Record Linkage with Transformer Language Models | Code | 1
AltDiffusion: A Multilingual Text-to-Image Diffusion Model | Code | 1
O^2-Recon: Completing 3D Reconstruction of Occluded Objects in the Scene with a Pre-trained 2D Diffusion Model | Code | 1
GraphSHA: Synthesizing Harder Samples for Class-Imbalanced Node Classification | Code | 1
Path-Specific Counterfactual Fairness for Recommender Systems | Code | 1
Road Planning for Slums via Deep Reinforcement Learning | Code | 1
Pre-trained Embeddings for Entity Resolution: An Experimental Analysis [Experiment, Analysis & Benchmark] | Code | 1
Sparkly: A Simple yet Surprisingly Strong TF/IDF Blocker for Entity Matching | Code | 1
Tracker Meets Night: A Transformer Enhancer for UAV Tracking | Code | 1
Self-Destructing Models: Increasing the Costs of Harmful Dual Uses of Foundation Models | Code | 1
ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting | Code | 1
MDFlow: Unsupervised Optical Flow Learning by Reliable Mutual Knowledge Distillation | Code | 1
Multi-scale Attention Network for Single Image Super-Resolution | Code | 1
S^2-Transformer for Mask-Aware Hyperspectral Image Reconstruction | Code | 1
Content-aware Scalable Deep Compressed Sensing | Code | 1
Sudowoodo: Contrastive Self-supervised Learning for Multi-purpose Data Integration and Preparation | Code | 1
NeuralPassthrough: Learned Real-Time View Synthesis for VR | Code | 1
Backdoor Attacks on Vision Transformers | Code | 1
User-controllable Recommendation Against Filter Bubbles | Code | 1
AutoFR: Automated Filter Rule Generation for Adblocking | Code | 1
A Knowledge Graph Embeddings based Approach for Author Name Disambiguation using Literals | Code | 1
StyleSwin: Transformer-based GAN for High-resolution Image Generation | Code | 1
From General to Specific: Informative Scene Graph Generation via Balance Adjustment | Code | 1
COAST: COntrollable Arbitrary-Sampling NeTwork for Compressive Sensing | Code | 1
Deep learning for blocking in entity matching: a design space exploration | Code | 1
DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification | Code | 1
Generate, Prune, Select: A Pipeline for Counterspeech Generation against Online Hate Speech | Code | 1
Deep Indexed Active Learning for Matching Heterogeneous Entity Representations | Code | 1
ISTA-Net++: Flexible Deep Unfolding Network for Compressive Sensing | Code | 1
Time-Ordered Recent Event (TORE) Volumes for Event Cameras | Code | 1
WearMask: Fast In-browser Face Mask Detection with Serverless Edge Computing for COVID-19 | Code | 1

No leaderboard results yet.