Decoding-based Regression

2025-01-31 · Code Available

Xingyou Song, Dara Bahri

Abstract

Language models have recently been shown capable of performing regression tasks wherein numeric predictions are represented as decoded strings. In this work, we provide theoretical grounds for this capability and furthermore investigate the utility of causal auto-regressive sequence models when they are applied to any feature representation. We find that, despite being trained in the usual way (next-token prediction via cross-entropy loss), decoding-based regression is as performant as traditional approaches for tabular regression tasks, while being flexible enough to capture arbitrary distributions, such as in the task of density estimation.
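To make the idea of "numeric predictions represented as decoded strings" concrete, here is a minimal sketch of one possible scheme. It assumes the target has been normalized to [0, 1) and is written as a fixed-length sequence of base-B digit tokens; the function names (`encode`, `decode`, `sequence_nll`) and the choice of base and length are illustrative assumptions, not taken from the authors' code. The last function shows how the usual cross-entropy objective applies to such a token sequence.

```python
import math

def encode(y, base=10, length=4):
    """Map y in [0, 1) to `length` digit tokens (truncated base-`base` expansion).

    Hypothetical tokenization for illustration only.
    """
    if not 0.0 <= y < 1.0:
        raise ValueError("y must lie in [0, 1)")
    scaled = int(y * base**length)  # truncate to `length` digits
    return [(scaled // base**k) % base for k in range(length - 1, -1, -1)]

def decode(tokens, base=10):
    """Invert `encode`: turn digit tokens back into a number in [0, 1)."""
    value = 0
    for t in tokens:
        value = value * base + t
    return value / base ** len(tokens)

def sequence_nll(step_probs, tokens):
    """Cross-entropy of a target token sequence.

    step_probs[i] is the model's distribution over the vocabulary at step i
    (here just a list of probabilities); the loss is the summed negative
    log-probability of each target token, as in next-token prediction.
    """
    return -sum(math.log(p[t]) for p, t in zip(step_probs, tokens))

tokens = encode(0.25)      # digit tokens for the target 0.25
recovered = decode(tokens)

# A model that is uniform over 10 digits at every step (placeholder for a real decoder).
uniform = [[0.1] * 10 for _ in tokens]
loss = sequence_nll(uniform, tokens)
```

Because the decoder outputs a full distribution over each digit, sampling token sequences and decoding them yields samples from a predictive distribution over the target, which is what allows this kind of model to be used for density estimation rather than only point prediction.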
