
Analyzing limits for in-context learning

2025-02-05

Omar Naim, Nicholas Asher


Abstract

We examine the limits of in-context learning (ICL) in transformer models trained from scratch, focusing on function approximation tasks as a controlled setting in which to uncover fundamental behaviors. While we show empirically that transformer models can generalize, approximating unseen classes of non-linear polynomial functions, they cannot generalize beyond a certain range of values. We provide both empirical and mathematical arguments that these limitations stem from architectural components, namely layer normalization and the attention scoring function, softmax. Together, our findings reveal structural constraints on ICL that are often masked in more complex NLP tasks but that need to be understood to improve robustness and interpretability in transformer-based models.
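The two architectural components the abstract names both impose hard bounds on what a transformer can output. A minimal numpy sketch (a hypothetical illustration, not the paper's code) of the intuition: layer normalization erases the scale of its input, and softmax attention outputs are convex combinations of the value vectors, so neither can produce values outside a fixed range no matter how large the in-context inputs grow.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize to zero mean and unit variance: the output is
    # (almost) independent of the input's overall scale.
    return (x - x.mean()) / np.sqrt(x.var() + eps)

def softmax(z):
    # Numerically stable softmax: positive weights summing to 1.
    e = np.exp(z - z.max())
    return e / e.sum()

x_small = np.array([1.0, 2.0, 3.0, 4.0])
x_large = 1000.0 * x_small

# Layer norm erases magnitude: both inputs map to (nearly) the
# same normalized vector, so downstream layers cannot "see" scale.
print(np.allclose(layer_norm(x_small), layer_norm(x_large)))

# Softmax attention output = convex combination of the values,
# so it can never leave the interval [values.min(), values.max()].
values = np.array([0.5, 1.5, -2.0])
weights = softmax(np.array([3.0, 1.0, 0.2]))
out = weights @ values
print(values.min() <= out <= values.max())
```

Both checks print `True`: scaling the input by 1000 leaves the layer-norm output essentially unchanged, and the attention output stays inside the range spanned by the values. This is the structural reason a trained model can interpolate within the value range it has seen but fails to extrapolate beyond it.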
