SOTAVerified

Data Integration

Data integration (also called information integration) is the process of consolidating data from a set of heterogeneous data sources into a single uniform data set (materialized integration) or view on the data (virtual integration). Data integration pipelines involve subtasks such as schema matching, table annotation, entity resolution, value normalization, data cleansing, and data fusion. Application domains of data integration include data warehousing, data lakes, and knowledge base consolidation.
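As a rough illustration of how some of these subtasks fit together in a materialized integration, the sketch below walks two toy sources through schema matching, value normalization, entity resolution, and data fusion. Everything in it is an assumption made for illustration: the sample records, the hand-written attribute mapping, the exact-match resolution key, and the "lowest price wins" fusion rule are all hypothetical, and real pipelines typically learn or infer these steps rather than hard-coding them.

```python
from collections import defaultdict

# Two hypothetical sources describing products under different schemas.
SOURCE_A = [
    {"name": "Acme Widget", "price_usd": "19.99", "brand": "ACME"},
    {"name": "Bolt Set", "price_usd": "5.00", "brand": "Fixit"},
]
SOURCE_B = [
    {"title": "acme  widget", "cost": 19.49, "manufacturer": "Acme", "color": "red"},
    {"title": "Hammer", "cost": 12.0, "manufacturer": "Fixit", "color": None},
]

# Schema matching result, written by hand here (an assumption);
# it maps each source attribute to an attribute of the unified schema.
MAPPINGS = {
    "A": {"name": "name", "price_usd": "price", "brand": "brand"},
    "B": {"title": "name", "cost": "price", "manufacturer": "brand", "color": "color"},
}


def to_unified(record, mapping):
    """Rename source attributes to the unified schema."""
    return {mapping[key]: value for key, value in record.items() if key in mapping}


def normalize(record):
    """Value normalization: collapse whitespace, lowercase text, parse prices."""
    out = dict(record)
    out["name"] = " ".join(str(out["name"]).split()).lower()
    out["brand"] = str(out["brand"]).strip().lower()
    out["price"] = float(out["price"])
    return out


def resolve_entities(records):
    """Entity resolution by exact match on (name, brand) after normalization."""
    clusters = defaultdict(list)
    for record in records:
        clusters[(record["name"], record["brand"])].append(record)
    return clusters


def fuse(cluster):
    """Data fusion: first non-missing value per attribute; lowest price wins."""
    fused = {}
    for record in cluster:
        for key, value in record.items():
            if value is not None and key not in fused:
                fused[key] = value
    fused["price"] = min(record["price"] for record in cluster)
    return fused


def integrate(sources):
    """Materialized integration: return one fused record per resolved entity."""
    unified = [
        normalize(to_unified(record, MAPPINGS[source_id]))
        for source_id, records in sources.items()
        for record in records
    ]
    return [fuse(cluster) for cluster in resolve_entities(unified).values()]


integrated = integrate({"A": SOURCE_A, "B": SOURCE_B})
for row in integrated:
    print(row)
```

With these toy inputs, the two "Acme Widget" records collapse into a single entity, while "Bolt Set" and "Hammer" stay separate. A virtual integration would keep the sources in place and apply the same mapping at query time instead of building the fused list up front.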

Papers

Showing 51-100 of 431 papers

Title | Status | Hype
Is your data alignable? Principled and interpretable alignability testing and integration of single-cell data | Code | 1
WDC Products: A Multi-Dimensional Entity Matching Benchmark | Code | 1
Multimodal Quantum Natural Language Processing: A Novel Framework for using Quantum Methods to Analyse Real Data | Code | 0
Alternative Telescopic Displacement: An Efficient Multimodal Alignment Method | Code | 0
Multimodal Contextualized Semantic Parsing from Speech | Code | 0
MODIS: Multi-Omics Data Integration for Small and Unpaired Datasets | Code | 0
Modelling Technical and Biological Effects in scRNA-seq data with Scalable GPLVMs | Code | 0
Moment-based parameter inference with error guarantees for stochastic reaction networks | Code | 0
A Systematic Approach to Featurization for Cancer Drug Sensitivity Predictions with Deep Learning | Code | 0
Multi-dataset and Transfer Learning Using Gene Expression Knowledge Graphs | Code | 0
A Unified Joint Matrix Factorization Framework for Data Integration | Code | 0
AdapterEM: Pre-trained Language Model Adaptation for Generalized Entity Matching using Adapter-tuning | Code | 0
Reducing Biases in Record Matching Through Scores Calibration | Code | 0
Multi-Omic Data Integration and Feature Selection for Survival-based Patient Stratification via Supervised Concrete Autoencoders | Code | 0
A Survey of Pipeline Tools for Data Engineering | Code | 0
Leveraging Legacy Data to Accelerate Materials Design via Preference Learning | Code | 0
Learning to Characterize Matching Experts | Code | 0
Lipschitz-regularized gradient flows and generative particle algorithms for high-dimensional scarce data | Code | 0
Kernel learning approaches for summarising and combining posterior similarity matrices | Code | 0
LLMs in Software Security: A Survey of Vulnerability Detection Techniques and Insights | Code | 0
Integrating Weather Station Data and Radar for Precipitation Nowcasting: SmaAt-fUsion and SmaAt-Krige-GNet | Code | 0
Integrating Heterogeneous Gene Expression Data through Knowledge Graphs for Improving Diabetes Prediction | Code | 0
Joint Estimation and Inference for Data Integration Problems based on Multiple Multi-layered Gaussian Graphical Models | Code | 0
Learn2Mix: Training Neural Networks Using Adaptive Data Integration | Code | 0
MIMIC-III, a freely accessible critical care database | Code | 0
Multi-Task Adversarial Variational Autoencoder for Estimating Biological Brain Age with Multimodal Neuroimaging | Code | 0
IAM: Enhancing RGB-D Instance Segmentation with New Benchmarks | Code | 0
GraphSeqLM: A Unified Graph Language Framework for Omic Graph Learning | Code | 0
Reconstructing Nonlinear Dynamical Systems from Multi-Modal Time Series | Code | 0
Gaussian Process Emulators for Few-Shot Segmentation in Cardiac MRI | Code | 0
Generalized probabilistic canonical correlation analysis for multi-modal data integration with full or partial observations | Code | 0
Graph Integration for Diffusion-Based Manifold Alignment | Code | 0
From Classical Machine Learning to Emerging Foundation Models: Review on Multimodal Data Integration for Cancer Research | Code | 0
From Swath to Full-Disc: Advancing Precipitation Retrieval with Multimodal Knowledge Expansion | Code | 0
Federated Learning in Chemical Engineering: A Tutorial on a Framework for Privacy-Preserving Collaboration Across Distributed Data Sources | Code | 0
Evaluating AI capabilities in detecting conspiracy theories on YouTube | Code | 0
CAVACHON: a hierarchical variational autoencoder to integrate multi-modal single-cell data | Code | 0
Evaluating approaches for supervised semantic labeling | Code | 0
Entropic Optimal Transport Eigenmaps for Nonlinear Alignment and Joint Embedding of High-Dimensional Datasets | Code | 0
Combining Experimental and Historical Data for Policy Evaluation | Code | 0
Evaluating Blocking Biases in Entity Matching | Code | 0
Gaussian Copula Models for Nonignorable Missing Data Using Auxiliary Marginal Quantiles | Code | 0
Comparative Analysis of Multi-Omics Integration Using Advanced Graph Neural Networks for Cancer Classification | Code | 0
Heter-LP: A heterogeneous label propagation algorithm and its application in drug repositioning | Code | 0
An Empirical Meta-analysis of the Life Sciences (Linked?) Open Data on the Web | Code | 0
Consistent and Flexible Selectivity Estimation for High-Dimensional Data | Code | 0
Efficient Vertical Federated Learning Method for Ridge Regression of Large-Scale Samples via Least-Squares Solution | Code | 0
Intermediate triple table: A general architecture for virtual knowledge graphs | Code | 0
Building Flexible, Scalable, and Machine Learning-ready Multimodal Oncology Datasets | Code | 0
An attention model to analyse the risk of agitation and urinary tract infections in people with dementia | Code | 0
Page 2 of 9

No leaderboard results yet.