SOTAVerified

Data Integration

Data integration (also called information integration) is the process of consolidating data from a set of heterogeneous data sources into a single uniform data set (materialized integration) or a unified view on the data (virtual integration). Data integration pipelines involve subtasks such as schema matching, table annotation, entity resolution, value normalization, data cleansing, and data fusion. Application domains of data integration include data warehousing, data lakes, and knowledge base consolidation.
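As a rough illustration of how some of these subtasks fit together in a materialized integration, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the two toy sources, the hand-written `SCHEMA_MAP_B` mapping, the helper names, the string-similarity threshold, and the "prefer the first non-missing value" fusion rule. Real pipelines typically use learned or far more robust matchers.

```python
from difflib import SequenceMatcher

# Two hypothetical sources with different schemas (all data invented).
source_a = [
    {"name": "Acme Corp.", "city": "Berlin", "phone": None},
    {"name": "Globex", "city": "Paris", "phone": "+33 1 23 45"},
]
source_b = [
    {"company": "ACME corp", "location": "Berlin", "tel": "+49 30 1234"},
    {"company": "Initech", "location": "Austin", "tel": None},
]

# Schema matching: a hand-written mapping from source B's attributes to source A's.
SCHEMA_MAP_B = {"company": "name", "location": "city", "tel": "phone"}


def to_common_schema(record, mapping):
    """Rename a record's attributes according to the schema mapping."""
    return {mapping.get(key, key): value for key, value in record.items()}


def normalize_name(name):
    """Value normalization: lowercase, drop punctuation, collapse whitespace."""
    kept = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def same_entity(rec1, rec2, threshold=0.9):
    """Entity resolution: compare normalized names by string similarity."""
    left = normalize_name(rec1["name"])
    right = normalize_name(rec2["name"])
    return SequenceMatcher(None, left, right).ratio() >= threshold


def fuse(rec1, rec2):
    """Data fusion: keep rec1's value, fall back to rec2's when it is missing."""
    return {key: rec1[key] if rec1[key] is not None else rec2[key] for key in rec1}


def integrate(a_records, b_records):
    """Build one uniform data set from both sources (materialized integration)."""
    integrated = [dict(record) for record in a_records]
    for raw in b_records:
        record = to_common_schema(raw, SCHEMA_MAP_B)
        for i, existing in enumerate(integrated):
            if same_entity(existing, record):
                integrated[i] = fuse(existing, record)
                break
        else:
            # No match found: the record describes a new entity.
            integrated.append(record)
    return integrated


result = integrate(source_a, source_b)
```

In this sketch the two Acme records are merged into one entry that takes its phone number from the second source, while Globex and Initech remain separate entities. A virtual integration would instead leave the sources untouched and apply the same mapping at query time.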

Papers

Showing 41-50 of 431 papers

| Title | Status | Hype |
|---|---|---|
| eipy: An Open-Source Python Package for Multi-modal Data Integration using Heterogeneous Ensembles | Code | 1 |
| A Variational Information Bottleneck Approach to Multi-Omics Data Integration | Code | 1 |
| Conformal Trajectory Prediction with Multi-View Data Integration in Cooperative Driving | Code | 1 |
| Fine-tuning Large Language Models for Entity Matching | Code | 1 |
| BayReL: Bayesian Relational Learning for Multi-omics Data Integration | Code | 1 |
| GripNet: Graph Information Propagation on Supergraph for Heterogeneous Graphs | Code | 1 |
| Know2BIO: A Comprehensive Dual-View Benchmark for Evolving Biomedical Knowledge Graphs | Code | 1 |
| KramaBench: A Benchmark for AI Systems on Data-to-Insight Pipelines over Data Lakes | Code | 1 |
| COMO: A Pipeline for Multi-Omics Data Integration in Metabolic Modeling and Drug Discovery | Code | 1 |
| WDC Products: A Multi-Dimensional Entity Matching Benchmark | Code | 1 |

No leaderboard results yet.