CAPS: Unifying Attention, Recurrence, and Alignment in Transformer-based Time Series Forecasting
Viresh Pati, Yubin Kim, Vinh Pham, Jevon Twitty, Shihao Yang, Jiecheng Lu
Abstract
This paper presents CAPS (Clock-weighted Aggregation with Prefix-products and Softmax), a structured attention mechanism for time series forecasting that decouples three distinct temporal structures: global trends, local shocks, and seasonal patterns. Standard softmax attention entangles these through global normalization, while recent recurrent models sacrifice long-term, order-independent selection for order-dependent causal structure. CAPS combines SO(2) rotations for phase alignment with three additive gating paths -- Riemann softmax, prefix-product gates, and a Clock baseline -- within a single attention layer. We introduce the Clock mechanism, a learned temporal weighting that modulates these paths through a shared notion of temporal importance. Experiments on long- and short-term forecasting benchmarks surpass vanilla softmax and linear attention mechanisms and demonstrate competitive performance against seven strong baselines with linear complexity. Our code implementation is available at https://github.com/vireshpati/CAPS-Attention.