SOTAVerified

Data Integration

Data integration (also called information integration) is the process of consolidating data from a set of heterogeneous data sources into a single uniform data set (materialized integration) or view on the data (virtual integration). Data integration pipelines involve subtasks such as schema matching, table annotation, entity resolution, value normalization, data cleansing, and data fusion. Application domains of data integration include data warehousing, data lakes, and knowledge base consolidation.
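Several of the subtasks named above can be illustrated with a minimal Python sketch of materialized integration. Everything in it is an assumption made for illustration: the two toy sources, their field names, the hand-written schema mapping, the use of the normalized email as the matching key, and the "first non-empty value wins" fusion rule are hypothetical and are not taken from any of the listed papers.

```python
# Hypothetical toy sources with different schemas (illustrative only).
SOURCE_A = [
    {"full_name": "Ada Lovelace", "mail": "ADA@EXAMPLE.ORG", "city": ""},
    {"full_name": "Alan Turing", "mail": "alan@example.org", "city": "London"},
]
SOURCE_B = [
    {"name": " ada  lovelace ", "email": "ada@example.org", "town": "London"},
    {"name": "Grace Hopper", "email": "grace@example.org", "town": "New York"},
]

# Schema matching: a hand-written mapping from source attributes
# to a shared target schema (real systems often learn or infer this).
SCHEMA_MAP = {
    "A": {"full_name": "name", "mail": "email", "city": "city"},
    "B": {"name": "name", "email": "email", "town": "city"},
}


def to_target_schema(record, mapping):
    """Rename source attributes to the shared target attributes."""
    return {target: record.get(source, "") for source, target in mapping.items()}


def normalize(record):
    """Value normalization: collapse whitespace and unify letter case."""
    return {
        "name": " ".join(record["name"].split()).title(),
        "email": record["email"].strip().lower(),
        "city": record["city"].strip(),
    }


def integrate(sources):
    """Resolve entities by normalized email, then fuse matched records."""
    fused = {}
    for source_id, records in sources.items():
        for raw in records:
            rec = normalize(to_target_schema(raw, SCHEMA_MAP[source_id]))
            key = rec["email"]  # assumed matching key for entity resolution
            merged = fused.setdefault(key, dict(rec))
            # Data fusion: fill attributes that are still empty.
            for attr, value in rec.items():
                if not merged[attr] and value:
                    merged[attr] = value
    return sorted(fused.values(), key=lambda r: r["name"])


if __name__ == "__main__":
    for row in integrate({"A": SOURCE_A, "B": SOURCE_B}):
        print(row)
```

In this sketch the two "Ada Lovelace" records collapse into one entity whose missing city is filled from the second source, giving three integrated records in total. A virtual integration setup would keep the sources in place and apply a comparable mapping at query time rather than storing the merged result.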

Papers

Showing 351–400 of 431 papers

Title | Status | Hype
Evaluating Blocking Biases in Entity Matching | Code | 0
Leveraging Legacy Data to Accelerate Materials Design via Preference Learning | Code | 0
Semantic Web: Past, Present, and Future | Code | 0
Towards a Spatiotemporal Fusion Approach to Precipitation Nowcasting | Code | 0
mvlearnR and Shiny App for multiview learning | Code | 0
Lipschitz-regularized gradient flows and generative particle algorithms for high-dimensional scarce data | Code | 0
Cross-Species Data Integration for Enhanced Layer Segmentation in Kidney Pathology | Code | 0
Federated Learning in Chemical Engineering: A Tutorial on a Framework for Privacy-Preserving Collaboration Across Distributed Data Sources | Code | 0
AdapterEM: Pre-trained Language Model Adaptation for Generalized Entity Matching using Adapter-tuning | Code | 0
Bayesian Hybrid Matrix Factorisation for Data Integration | Code | 0
Evaluating approaches for supervised semantic labeling | Code | 0
Cross Modal Data Discovery over Structured and Unstructured Data Lakes | Code | 0
An attention model to analyse the risk of agitation and urinary tract infections in people with dementia | Code | 0
From Classical Machine Learning to Emerging Foundation Models: Review on Multimodal Data Integration for Cancer Research | Code | 0
From Swath to Full-Disc: Advancing Precipitation Retrieval with Multimodal Knowledge Expansion | Code | 0
Mining the contribution of intensive care clinical course to outcome after traumatic brain injury | Code | 0
Evaluating AI capabilities in detecting conspiracy theories on YouTube | Code | 0
SOTAB: The WDC Schema.org Table Annotation Benchmark | Code | 0
Profiling Entity Matching Benchmark Tasks | Code | 0
Gaussian Copula Models for Nonignorable Missing Data Using Auxiliary Marginal Quantiles | Code | 0
Gaussian Process Emulators for Few-Shot Segmentation in Cardiac MRI | Code | 0
Generalized probabilistic canonical correlation analysis for multi-modal data integration with full or partial observations | Code | 0
Neuro-symbolic representation learning on biological knowledge graphs | Code | 0
PulmoFusion: Advancing Pulmonary Health with Efficient Multi-Modal Fusion | Code | 0
MIMIC-III, a freely accessible critical care database | Code | 0
OmniEcon Nexus: Global Microeconomic Simulation Engine | Code | 0
A Survey of Pipeline Tools for Data Engineering | Code | 0
Graph Integration for Diffusion-Based Manifold Alignment | Code | 0
Reducing Biases in Record Matching Through Scores Calibration | Code | 0
VeeAlign: Multifaceted Context Representation using Dual Attention for Ontology Alignment | Code | 0
Entropic Optimal Transport Eigenmaps for Nonlinear Alignment and Joint Embedding of High-Dimensional Datasets | Code | 0
GraphSeqLM: A Unified Graph Language Framework for Omic Graph Learning | Code | 0
Stacked Autoencoder Based Multi-Omics Data Integration for Cancer Survival Prediction | Code | 0
Enhancing Glucose Level Prediction of ICU Patients through Hierarchical Modeling of Irregular Time-Series | Code | 0
Heter-LP: A heterogeneous label propagation algorithm and its application in drug repositioning | Code | 0
Modelling Technical and Biological Effects in scRNA-seq data with Scalable GPLVMs | Code | 0
MODIS: Multi-Omics Data Integration for Small and Unpaired Datasets | Code | 0
Moment-based parameter inference with error guarantees for stochastic reaction networks | Code | 0
Consistent and Flexible Selectivity Estimation for High-Dimensional Data | Code | 0
Temporally Aligning Long Audio Interviews with Questions: A Case Study in Multimodal Data Integration | Code | 0
IAM: Enhancing RGB-D Instance Segmentation with New Benchmarks | Code | 0
Reconstructing Nonlinear Dynamical Systems from Multi-Modal Time Series | Code | 0
Multi-dataset and Transfer Learning Using Gene Expression Knowledge Graphs | Code | 0
Elastic Coupled Co-clustering for Single-Cell Genomic Data | Code | 0
The Battleship Approach to the Low Resource Entity Matching Problem | Code | 0
The Cell Ontology in the age of single-cell omics | Code | 0
Combining Experimental and Historical Data for Policy Evaluation | Code | 0
Efficient Vertical Federated Learning Method for Ridge Regression of Large-Scale Samples via Least-Squares Solution | Code | 0
Integrated community occupancy models: A framework to assess occurrence and biodiversity dynamics using multiple data sources | Code | 0
Integrating Heterogeneous Gene Expression Data through Knowledge Graphs for Improving Diabetes Prediction | Code | 0
Page 8 of 9

No leaderboard results yet.