Task Expansion and Cross Refinement for Open-World Conditional Modeling
Shreyas Bhat Brahmavar, Qiyang Liu, Yang Li, Junier Oliva
Abstract
Open-world conditional modeling (OCM) requires a single model to answer arbitrary conditional queries across heterogeneous datasets, where the observed variables and targets vary and arise from a vast, open-ended task universe. Because any finite collection of real-world datasets covers only a small fraction of this space, we propose Task Expansion and Cross Refinement (TEXR), a semi-supervised framework that enlarges effective task coverage through structured synthesis and refinement of semantic data contexts. TEXR first generates diverse uninstantiated dataset schemas and weakly instantiates them via structured probabilistic generators guided by large language models. It then performs cross-model refinement: models are trained on disjoint data partitions and revise synthetic values across splits, reducing confirmation bias and improving pseudo-value quality. The refined synthetic datasets are aggregated with real data to train a unified conditional model. Across heterogeneous tabular benchmarks, TEXR consistently improves zero-, few-, and many-shot performance for multiple OCM backbones, demonstrating that structured task expansion and cross refinement enhance open-world conditional modeling.
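The cross-refinement idea in the abstract can be illustrated with a minimal sketch: two models are fit on disjoint synthetic partitions (each combined with real data), and each model revises the pseudo-values of the *other* partition, so no model confirms its own labels. All names here are illustrative assumptions, not the paper's API, and a plain least-squares regressor stands in for the actual OCM backbone.

```python
import numpy as np

def cross_refine(real_X, real_y, synth_X, synth_y, rounds=2, seed=0):
    """Sketch of cross refinement (hypothetical interface, not TEXR's API).

    Splits the synthetic set into two disjoint partitions; each round,
    a model fit on real data plus one partition rewrites the pseudo-values
    of the opposite partition, limiting self-confirmation bias.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(synth_X))
    part_a, part_b = idx[: len(idx) // 2], idx[len(idx) // 2:]

    def fit(part):
        # Stand-in "model": linear least squares with a bias column.
        X = np.vstack([real_X, synth_X[part]])
        y = np.concatenate([real_y, synth_y[part]])
        w, *_ = np.linalg.lstsq(np.c_[X, np.ones(len(X))], y, rcond=None)
        return w

    for _ in range(rounds):
        w_a, w_b = fit(part_a), fit(part_b)
        # Cross revision: model A relabels partition B, and vice versa.
        synth_y[part_b] = np.c_[synth_X[part_b], np.ones(len(part_b))] @ w_a
        synth_y[part_a] = np.c_[synth_X[part_a], np.ones(len(part_a))] @ w_b
    return synth_y
```

On toy data where real labels follow a linear rule and synthetic pseudo-values are noisy, the refined values move closer to the underlying rule, which is the behavior the abstract attributes to cross refinement.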