SOTAVerified

Blocking

Entity resolution (also known as entity matching, record linkage, or duplicate detection) is the task of finding records that refer to the same real-world entity across different data sources (e.g., data files, books, websites, and databases). (Source: Wikipedia)

Blocking is a crucial step in any entity resolution pipeline because a pairwise comparison of all records across two data sources is infeasible at scale: the number of pairs grows with the product of the two source sizes. Blocking applies a computationally cheap method to generate a smaller set of candidate record pairs, reducing the workload of the matcher. During matching, a more expensive pairwise matcher then decides which candidate pairs are true matches and produces the final set of matching record pairs.
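As an illustration of the idea, here is a minimal sketch of key-based blocking in Python. It is not taken from any of the listed papers; the record layout (`id`, `name`), the helper names `blocking_key` and `candidate_pairs`, and the choice of a three-letter name prefix as the key are all assumptions made for the example.

```python
from collections import defaultdict


def blocking_key(record):
    # Hypothetical cheap key: first three letters of the lowercased name.
    return record["name"].strip().lower()[:3]


def candidate_pairs(left, right, key=blocking_key):
    # Group the right-hand records by blocking key.
    blocks = defaultdict(list)
    for r in right:
        blocks[key(r)].append(r)

    # Pair each left-hand record only with records in the same block.
    pairs = []
    for l in left:
        for r in blocks.get(key(l), []):
            pairs.append((l["id"], r["id"]))
    return pairs


# Toy data sources (invented for illustration).
left = [
    {"id": "a1", "name": "Jonathan Smith"},
    {"id": "a2", "name": "Maria Garcia"},
    {"id": "a3", "name": "Wei Chen"},
]
right = [
    {"id": "b1", "name": "Jon Smith"},
    {"id": "b2", "name": "Maria Garcia-Lopez"},
    {"id": "b3", "name": "Peter Müller"},
]

pairs = candidate_pairs(left, right)
all_pairs = len(left) * len(right)
print(f"{len(pairs)} candidate pairs instead of {all_pairs}")
```

Only the pairs that share a key are passed on to the expensive matcher. A prefix key is deliberately crude: it is cheap, but it drops true matches whose names differ in the first characters, which is the recall trade-off any blocking method has to manage.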


Papers

Showing 1–50 of 524 papers

Title | Status | Hype
An introduction to Causal Modelling | | 0
NoLoCo: No-all-reduce Low Communication Training Method for Large Models | Code | 1
Pushing the Limits of Extreme Weather: Constructing Extreme Heatwave Storylines with Differentiable Climate Models | Code | 0
Challenges in Automated Processing of Speech from Child Wearables: The Case of Voice Type Classifier | | 0
Matching Markets Meet LLMs: Algorithmic Reasoning with Ranked Preferences | | 0
The Coupling Effect of Sensing Targets on the Environment for 3GPP ISAC Channels: Observation, Modeling, and Validation | | 0
Decoding Knowledge Attribution in Mixture-of-Experts: A Framework of Basic-Refinement Collaboration and Efficiency Analysis | | 0
Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents | Code | 2
Sensitivity of DC Network Representation for GIC Analysis | | 0
Derailing Non-Answers via Logit Suppression at Output Subspace Boundaries in RLHF-Aligned Language Models | | 0
Streamlining Resilient Kubernetes Autoscaling with Multi-Agent Systems via an Automated Online Design Framework | | 0
Generative RLHF-V: Learning Principles from Multi-modal Human Preference | | 0
AI-empowered Channel Estimation for Block-based Active IRS-enhanced Hybrid-field IoT Network | | 0
ELIS: Efficient LLM Iterative Scheduling System with Response Length Predictor | | 0
Non-Blocking Robustness Analysis in Discrete Event Systems | | 0
Using mathematical models of heart cells to assess the safety of new pharmaceutical drugs | | 0
LithOS: An Operating System for Efficient Machine Learning on GPUs | | 0
Leveraging Language Models for Automated Patient Record Linkage | | 0
Beamforming Design and Association Scheme for Multi-RIS Multi-User mmWave Systems Through Graph Neural Networks | | 0
Improvable Students in School Choice | | 0
Progent: Programmable Privilege Control for LLM Agents | Code | 1
Ctrl-Z: Controlling AI Agents via Resampling | | 0
Statistical Linear Regression Approach to Kalman Filtering and Smoothing under Cyber-Attacks | | 0
Deep Learning Meets Teleconnections: Improving S2S Predictions for European Winter Weather | Code | 0
On-Chip and Off-Chip TIA Amplifiers for Nanopore Signal Readout Design, Performance and Challenges: A Review | | 0
Generative Classifier for Domain Generalization | | 0
Matching, Unanticipated Experiences, Divorce, Flirting, Rematching, Etc | | 0
Priority-Aware Preemptive Scheduling for Mixed-Priority Workloads in MoE Inference | | 0
Identification of Minimally Restrictive Assembly Sequences using Supervisory Control Theory | | 0
Fault Localization and State Estimation of Power Grid under Parallel Cyber-Physical Attacks | | 0
CORBA: Contagious Recursive Blocking Attacks on Multi-Agent Systems Based on Large Language Models | Code | 1
Autellix: An Efficient Serving Engine for LLM Agents as General Programs | | 0
Observability-Blocking Controls for Double-Integrator and Higher Order Integrator Networks | | 0
Minimizing Instability in Strategy-Proof Matching Mechanism Using A Linear Programming Approach | | 0
Reinforcement Learning for Dynamic Resource Allocation in Optical Networks: Hype or Hope? | Code | 1
Evolving Hate Speech Online: An Adaptive Framework for Detection and Mitigation | | 0
Linking Cryptoasset Attribution Tags to Knowledge Graph Entities: An LLM-based Approach | Code | 0
Democratizing AI: Open-source Scalable LLM Training on GPU-based Supercomputers | | 0
DiffIM: Differentiable Influence Minimization with Surrogate Modeling and Continuous Relaxation | Code | 0
Leveraging Large Language Models to Predict Antibody Biological Activity Against Influenza A Hemagglutinin | | 0
Enhancing Model Defense Against Jailbreaks with Proactive Safety Reasoning | | 0
Replacing the Gallium Oxide Shell with Conductive Ag: Toward a Printable and Recyclable Composite for Highly Stretchable Electronics, Electromagnetic Shielding, and Thermal Interfaces | | 0
Foundation for unbiased cross-validation of spatio-temporal models for species distribution modeling | Code | 0
CAMEO: Autocorrelation-Preserving Line Simplification for Lossy Time Series Compression | | 0
Hybrid Parallel Collaborative Simulation Framework Integrating Device Physics with Circuit Dynamics for PDAE-Modeled Power Electronic Equipment | | 0
Broadband measurements and analysis of human blocking in a 60 GHz indoor radio channel | | 0
Gandalf the Red: Adaptive Security for LLMs | Code | 1
Network Diffuser for Placing-Scheduling Service Function Chains with Inverse Demonstration | | 0
mFabric: An Efficient and Scalable Fabric for Mixture-of-Experts Training | | 0
ABACUS: A FinOps Service for Cloud Cost Optimization | | 0

No leaderboard results yet.