SOTAVerified

Data Integration

Data integration (also called information integration) is the process of consolidating data from a set of heterogeneous data sources into a single uniform data set (materialized integration) or view on the data (virtual integration). Data integration pipelines involve subtasks such as schema matching, table annotation, entity resolution, value normalization, data cleansing, and data fusion. Application domains of data integration include data warehousing, data lakes, and knowledge base consolidation. Surveys on Data integration:

Papers

Showing 251-300 of 431 papers

| Title | Status | Hype |
| --- | --- | --- |
| Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs | | 0 |
| Scalable Similarity Joins of Tokenized Strings | | 0 |
| Scaling Data Science Solutions with Semantics and Machine Learning: Bosch Case | | 0 |
| scICML: Information-theoretic Co-clustering-based Multi-view Learning for the Integrative Analysis of Single-cell Multi-omics data | | 0 |
| Secure and Differentially Private Bayesian Learning on Distributed Data | | 0 |
| Segment-based fusion of multi-sensor multi-scale satellite soil moisture retrievals | | 0 |
| Semantic Annotation for Tabular Data | | 0 |
| Semantic Data Management in Data Lakes | | 0 |
| Siamese Graph Neural Networks for Data Integration | | 0 |
| Simplifying Data Integration: SLM-Driven Systems for Unified Semantic Queries Across Heterogeneous Databases | | 0 |
| Skeleton Detection Using Dual Radars with Integration of Dual-View CNN Models and mmPose | | 0 |
| Smart City Digital Twin Framework for Real-Time Multi-Data Integration and Wide Public Distribution | | 0 |
| Specifying, Monitoring, and Executing Workflows in Linked Data Environments | | 0 |
| Statistical Agnostic Regression: a machine learning method to validate regression models | | 0 |
| Stochastic Biological System-of-Systems Modelling for iPSC Culture | | 0 |
| Stratified Data Integration | | 0 |
| Streamlining Knowledge Graph Creation with PyRML | | 0 |
| Structured Matrix Completion with Applications to Genomic Data Integration | | 0 |
| Supervised prediction of aging-related genes from a context-specific protein interaction subnetwork | | 0 |
| Survive the Schema Changes: Integration of Unmanaged Data Using Deep Learning | | 0 |
| Synthetic Poisoning Attacks: The Impact of Poisoned MRI Image on U-Net Brain Tumor Segmentation | | 0 |
| Systematic Literature Review on Clinical Trial Eligibility Matching | | 0 |
| TabulaTime: A Novel Multimodal Deep Learning Framework for Advancing Acute Coronary Syndrome Prediction through Environmental and Clinical Data Integration | | 0 |
| Targeted Data Fusion for Causal Survival Analysis Under Distribution Shift | | 0 |
| Targeting Underrepresented Populations in Precision Medicine: A Federated Transfer Learning Approach | | 0 |
| TCKAN: A Novel Integrated Network Model for Predicting Mortality Risk in Sepsis Patients | | 0 |
| Technical Report on Data Integration and Preparation | | 0 |
| TemporalAugmenter: An Ensemble Recurrent Based Deep Learning Approach for Signal Classification | | 0 |
| The challenge of uncertainty quantification of large language models in medicine | | 0 |
| The S2 Hierarchical Discrete Global Grid as a Nexus for Data Representation, Integration, and Querying Across Geospatial Knowledge Graphs | | 0 |
| Time Series Data Imputation: A Survey on Deep Learning Approaches | | 0 |
| Towards a Generic Multimodal Architecture for Batch and Streaming Big Data Integration | | 0 |
| Towards a Microservice-based Middleware for a Multi-hazard Early Warning System | | 0 |
| Towards a Modular Ontology for Space Weather Research | | 0 |
| Towards future directions in data-integrative supervised prediction of human aging-related genes | | 0 |
| Towards multiple kernel principal component analysis for integrative analysis of tumor samples | | 0 |
| Towards Scalable Schema Mapping using Large Language Models | | 0 |
| Towards Unified Neural Decoding with Brain Functional Network Modeling | | 0 |
| Transforming Social Science Research with Transfer Learning: Social Science Survey Data Integration with AI | | 0 |
| TUM2TWIN: Introducing the Large-Scale Multimodal Urban Digital Twin Benchmark Dataset | | 0 |
| TweetsKB: A Public and Large-Scale RDF Corpus of Annotated Tweets | | 0 |
| Uncertainty in Automated Ontology Matching: Lessons Learned from an Empirical Experimentation | | 0 |
| Understanding Reflection Needs for Personal Health Data in Diabetes | | 0 |
| Unified Representation of Genomic and Biomedical Concepts through Multi-Task, Multi-Source Contrastive Learning | | 0 |
| Unlocking Historical Clinical Trial Data with ALIGN: A Compositional Large Language Model System for Medical Coding | | 0 |
| Update hydrological states or meteorological forcings? Comparing data assimilation methods for differentiable hydrologic models | | 0 |
| Urban Representation Learning for Fine-grained Economic Mapping: A Semi-supervised Graph-based Approach | | 0 |
| Reasoning about disclosure in data integration in the presence of source constraints | | 0 |
| When Geoscience Meets Foundation Models: Towards General Geoscience Artificial Intelligence System | | 0 |
| VITAL: Interactive Few-Shot Imitation Learning via Visual Human-in-the-Loop Corrections | | 0 |

No leaderboard results yet.