Eliciting Fine-Tuned Transformer Capabilities via Inference-Time Techniques
Asankhaya Sharma
Code: github.com/codelion/optillm
Abstract
Large language models have transformed natural language processing, yet supervised fine-tuning (SFT) remains computationally intensive. This paper formally proves that capabilities acquired through SFT can be approximated by a base transformer model using inference-time techniques, specifically in-context learning (ICL), without altering model parameters, under idealized assumptions including unbounded computational resources and access to the fine-tuning dataset. We extend these results to practical scenarios with finite context lengths and partial dataset access. For text generation tasks with fixed output length l, datasets of size O((mV/ε²) log(m/δ)) or, with bounded context, O((lV/ε²) log(1/δ)) suffice to approximate fine-tuned behavior across m contexts within error ε, where V is the vocabulary size and δ is the failure probability. For linear classification, datasets of size O(d/ε) or, with fixed context, O((1/ε²) log(1/δ)) are sufficient, where d is the input dimension. Grounded in the Turing completeness of transformers, these results provide a theoretical foundation for resource-efficient deployment of large language models, with practical techniques like retrieval-augmented generation bridging theory to real-world applications.
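The core idea, eliciting fine-tuned behavior by placing fine-tuning examples in the prompt rather than updating parameters, can be illustrated with a minimal sketch. The function and dataset below are hypothetical, not from the paper; they only show the mechanical step of turning a subset of a fine-tuning dataset into an in-context-learning prompt:

```python
def build_icl_prompt(examples, query, k=4):
    """Build a few-shot ICL prompt from the first k fine-tuning pairs.

    Instead of running SFT, the (input, output) pairs are serialized as
    demonstrations so a base model can condition on them at inference time.
    """
    demos = [f"Input: {x}\nOutput: {y}" for x, y in examples[:k]]
    demos.append(f"Input: {query}\nOutput:")
    return "\n\n".join(demos)

# Hypothetical sentiment-style fine-tuning pairs used as demonstrations.
dataset = [
    ("the movie was wonderful", "positive"),
    ("utterly boring plot", "negative"),
    ("great acting and pacing", "positive"),
    ("a waste of two hours", "negative"),
]

prompt = build_icl_prompt(dataset, "an unforgettable performance")
print(prompt)
```

The dataset-size bounds in the abstract govern how many such demonstrations are needed (and, with bounded context, how they must be selected, e.g. by retrieval) for the base model's conditional distribution to approximate the fine-tuned one within error ε.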