SOTAVerified

Data Integration

Data integration (also called information integration) is the process of consolidating data from a set of heterogeneous data sources into either a single uniform data set (materialized integration) or a view over the data (virtual integration). Data integration pipelines involve subtasks such as schema matching, table annotation, entity resolution, value normalization, data cleansing, and data fusion. Application domains of data integration include data warehousing, data lakes, and knowledge base consolidation.
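As a rough illustration of how some of the subtasks named above fit together, here is a minimal sketch of materialized integration over two toy sources. Everything in it is hypothetical: the records, the field names, the hand-written schema mappings (standing in for the output of schema matching), the choice of email as the entity key, and the "first non-empty value wins" fusion rule. Real pipelines use far more robust matching and conflict resolution.

```python
# Toy materialized integration of two heterogeneous sources.
# All records, field names, and rules below are hypothetical.

SOURCE_A = [
    {"full_name": "Ada Lovelace ", "email": "ADA@example.org", "city": "London"},
    {"full_name": "Alan Turing", "email": "alan@example.org", "city": ""},
]
SOURCE_B = [
    {"name": "ada lovelace", "mail": "ada@example.org", "phone": "555-0100"},
    {"name": "Grace Hopper", "mail": "grace@example.org", "phone": "555-0199"},
]

# Schema matching result, assumed to be given: source field -> unified field.
MAPPING_A = {"full_name": "name", "email": "email", "city": "city"}
MAPPING_B = {"name": "name", "mail": "email", "phone": "phone"}


def to_unified(record, mapping):
    """Rename fields to the unified schema and normalize their values."""
    out = {}
    for src_field, value in record.items():
        value = " ".join(value.split())  # trim and collapse whitespace
        field = mapping[src_field]
        if field == "email":
            value = value.lower()
        elif field == "name":
            value = value.title()
        out[field] = value
    return out


def integrate(*sources):
    """Resolve entities by email, then fuse: the first non-empty value wins."""
    fused = {}
    for records, mapping in sources:
        for record in records:
            unified = to_unified(record, mapping)
            entity = fused.setdefault(unified["email"], {})
            for field, value in unified.items():
                if value and not entity.get(field):
                    entity[field] = value
    return list(fused.values())


integrated = integrate((SOURCE_A, MAPPING_A), (SOURCE_B, MAPPING_B))
for row in integrated:
    print(row)
```

The two "Ada Lovelace" records collapse into one entity that carries both the city from the first source and the phone number from the second, which is the materialized case. A virtual integration would instead keep the sources in place and apply the same mappings at query time.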

Papers

Showing 51-100 of 431 papers

Title | Status | Hype
BayReL: Bayesian Relational Learning for Multi-omics Data Integration | Code | 1
SumGNN: Multi-typed Drug Interaction Prediction via Efficient Knowledge Graph Summarization | Code | 1
From Classical Machine Learning to Emerging Foundation Models: Review on Multimodal Data Integration for Cancer Research | Code | 0
Empowering Digital Agriculture: A Privacy-Preserving Framework for Data Sharing and Collaborative Research | - | 0
Intelligent Operation and Maintenance and Prediction Model Optimization for Improving Wind Power Generation Efficiency | - | 0
Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs | - | 0
Brain Imaging Foundation Models, Are We There Yet? A Systematic Review of Foundation Models for Brain Imaging and Biomedical Research | - | 0
Leveraging MIMIC Datasets for Better Digital Health: A Review on Open Problems, Progress Highlights, and Future Promises | - | 0
Enhancing Bagging Ensemble Regression with Data Integration for Time Series-Based Diabetes Prediction | - | 0
The Cell Ontology in the age of single-cell omics | Code | 0
From Swath to Full-Disc: Advancing Precipitation Retrieval with Multimodal Knowledge Expansion | Code | 0
Towards Unified Neural Decoding with Brain Functional Network Modeling | - | 0
Towards Scalable Schema Mapping using Large Language Models | - | 0
Multi-task Learning for Heterogeneous Data via Integrating Shared and Task-Specific Encodings | - | 0
Evaluating AI capabilities in detecting conspiracy theories on YouTube | Code | 0
Streamlining Knowledge Graph Creation with PyRML | - | 0
Towards a Spatiotemporal Fusion Approach to Precipitation Nowcasting | Code | 0
Control of Renewable Energy Communities using AI and Real-World Data | - | 0
Multimodal Generative AI for Story Point Estimation in Software Development | - | 0
A Cautionary Tale on Integrating Studies with Disparate Outcome Measures for Causal Inference | - | 0
Urban Representation Learning for Fine-grained Economic Mapping: A Semi-supervised Graph-based Approach | - | 0
TUM2TWIN: Introducing the Large-Scale Multimodal Urban Digital Twin Benchmark Dataset | - | 0
CrashSage: A Large Language Model-Centered Framework for Contextual and Interpretable Traffic Crash Analysis | - | 0
CDE-Mapper: Using Retrieval-Augmented Language Models for Linking Clinical Data Elements to Controlled Vocabularies | - | 0
Interpretable graph-based models on multimodal biomedical data integration: A technical review and benchmarking | - | 0
Multimodal Doctor-in-the-Loop: A Clinically-Guided Explainable Framework for Predicting Pathological Response in Non-Small Cell Lung Cancer | - | 0
Deep Multi-modal Breast Cancer Detection Network | - | 0
Leveraging Language Models for Automated Patient Record Linkage | - | 0
Generalized probabilistic canonical correlation analysis for multi-modal data integration with full or partial observations | Code | 0
Simplifying Data Integration: SLM-Driven Systems for Unified Semantic Queries Across Heterogeneous Databases | - | 0
OmniEcon Nexus: Global Microeconomic Simulation Engine | Code | 0
The challenge of uncertainty quantification of large language models in medicine | - | 0
From Automation to Autonomy in Smart Manufacturing: A Bayesian Optimization Framework for Modeling Multi-Objective Experimentation and Sequential Decision Making | - | 0
Cross-Asset Risk Management: Integrating LLMs for Real-Time Monitoring of Equity, Fixed Income, and Currency Markets | - | 0
A Systematic Decade Review of Trip Route Planning with Travel Time Estimation based on User Preferences and Behavior | - | 0
Multimodal Data Integration for Sustainable Indoor Gardening: Tracking Anyplant with Time Series Foundation Model | - | 0
Multi-dataset and Transfer Learning Using Gene Expression Knowledge Graphs | Code | 0
Federated Learning: A new frontier in the exploration of multi-institutional medical imaging data | - | 0
MODIS: Multi-Omics Data Integration for Small and Unpaired Datasets | Code | 0
GridMind: A Multi-Agent NLP Framework for Unified, Cross-Modal NFL Data Insights | - | 0
A Theoretical Framework for Graph-based Digital Twins for Supply Chain Management and Optimization | - | 0
Predicting Cardiopulmonary Exercise Testing Outcomes in Congenital Heart Disease Through Multi-modal Data Integration and Geometric Learning | - | 0
A Circular Construction Product Ontology for End-of-Life Decision-Making | - | 0
RetSTA: An LLM-Based Approach for Standardizing Clinical Fundus Image Reports | Code | 0
Representation Retrieval Learning for Heterogeneous Data Integration | - | 0
Hierarchical Cross-Modal Alignment for Open-Vocabulary 3D Object Detection | - | 0
Exploring LLM Agents for Cleaning Tabular Machine Learning Datasets | - | 0
Systematic Literature Review on Clinical Trial Eligibility Matching | - | 0
Can Large Language Models Unveil the Mysteries? An Exploration of Their Ability to Unlock Information in Complex Scenarios | - | 0
Intermediate triple table: A general architecture for virtual knowledge graphs | Code | 0

No leaderboard results yet.