In-Context Learning of Linear Systems: Generalization Theory and Applications to Operator Learning
Frank Cole, Yulong Lu, Wuzhe Xu, Tianhao Zhang
Code
- github.com/lugroupumn/icl-ellipticpdes (official, in paper; PyTorch)
- github.com/lugroupumn/icl_linear_systems (official, in paper)
Abstract
We study theoretical guarantees for solving linear systems in-context with a linear transformer architecture. For in-domain generalization, we establish neural scaling laws that bound the generalization error in terms of the number of tasks and the sample sizes used in training and inference. For out-of-domain generalization, we find that the behavior of trained transformers under task distribution shifts depends crucially on the distribution of tasks seen during training. We introduce a novel notion of task diversity and show that it gives a necessary and sufficient condition for pre-trained transformers to generalize under task distribution shifts. We also explore applications of in-context learning of linear systems, including in-context operator learning for PDEs. Finally, we present numerical experiments that validate the established theory.
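To make the setting concrete, below is a minimal NumPy sketch of in-context learning of a linear task (not the paper's architecture or construction, just an illustration). The prompt consists of labeled pairs (x_i, y_i) generated by a hidden linear map, and the prediction at a query point takes the form a single linear self-attention layer can express, namely one gradient-descent step from zero on the least-squares loss; all dimensions and parameter choices here are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 2000  # feature dimension, number of in-context examples (assumed)

# Hypothetical task: in-context examples (x_i, y_i) with y_i = <w, x_i>
# for a hidden task vector w drawn from the task distribution.
w = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w
x_query = rng.normal(size=d)

# Linear-attention-style prediction: equivalent to one gradient-descent
# step (from zero initialization) on the least-squares loss, evaluated
# at the query point.
eta = 1.0 / n
y_hat = x_query @ (eta * X.T @ y)  # = <x_query, eta * sum_i y_i x_i>

# With isotropic Gaussian features, eta * X^T X concentrates around the
# identity as n grows, so the one-step prediction approaches <w, x_query>.
y_true = x_query @ w
print(y_hat, y_true)
```

As the number of in-context examples n grows, the prediction error of this one-step rule shrinks, which is the kind of sample-size dependence the in-domain scaling laws quantify.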