Is In-Context Universality Enough? MLPs are Also Universal In-Context
2025-02-05
Anastasis Kratsios, Takashi Furuya
Abstract
The success of transformers is often linked to their ability to perform in-context learning. Recent work shows that transformers are universal in-context, capable of approximating any real-valued continuous function of a context (a probability measure over X ⊆ R^d) and a query x ∈ X. This raises the question: Does in-context universality explain their advantage over classical models? We answer this in the negative by proving that MLPs with trainable activation functions are also universal in-context. This suggests that the transformer's success is likely due to other factors, such as inductive bias or training stability.
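To make the setting concrete, below is a minimal sketch (assuming PyTorch) of an MLP with a trainable activation evaluated on a (context, query) pair. It is illustrative only: the context here is an empirical measure given by N sample points and is summarized by a mean embedding, and the learnable rational activation is a stand-in choice; neither detail is taken from the paper's construction.

```python
import torch
import torch.nn as nn


class LearnableRational(nn.Module):
    """Trainable activation: a low-order rational function with learnable coefficients."""

    def __init__(self, num=4, den=3):
        super().__init__()
        self.p = nn.Parameter(torch.randn(num) * 0.1)  # numerator coefficients
        self.q = nn.Parameter(torch.randn(den) * 0.1)  # denominator coefficients

    def forward(self, x):
        num_powers = torch.stack([x ** i for i in range(len(self.p))], dim=-1)
        numerator = (num_powers * self.p).sum(-1)
        den_powers = torch.stack([x ** (i + 1) for i in range(len(self.q))], dim=-1)
        denominator = 1.0 + (den_powers * self.q).sum(-1).abs()  # kept positive to avoid poles
        return numerator / denominator


class InContextMLP(nn.Module):
    """MLP acting on (context, query): context samples are embedded pointwise and
    mean-pooled (a crude stand-in for integrating against the measure), then
    concatenated with the query and passed through trainable-activation layers."""

    def __init__(self, d, hidden=64):
        super().__init__()
        self.embed = nn.Linear(d, hidden)  # per-point embedding of context samples
        self.net = nn.Sequential(
            nn.Linear(hidden + d, hidden), LearnableRational(),
            nn.Linear(hidden, hidden), LearnableRational(),
            nn.Linear(hidden, 1),
        )

    def forward(self, context, query):
        # context: (batch, N, d) samples of the measure; query: (batch, d)
        ctx = self.embed(context).mean(dim=1)  # mean embedding of the empirical context
        return self.net(torch.cat([ctx, query], dim=-1)).squeeze(-1)


model = InContextMLP(d=2)
out = model(torch.randn(8, 16, 2), torch.randn(8, 2))  # 8 tasks, 16 context points each
print(out.shape)  # torch.Size([8])
```

Under this reading, "universal in-context" means the map from (context, query) to a real value can be approximated arbitrarily well; the paper's claim is that an MLP with trainable activations suffices for this, so the approximation property alone cannot single out transformers.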