The Phasor Transformer: Resolving Attention Bottlenecks on the Unit Circle
Dibakar Sigdel
Abstract
Transformer models have redefined sequence learning, yet dot-product self-attention introduces a quadratic token-mixing bottleneck for long-context time-series. We introduce the Phasor Transformer block, a phase-native alternative that represents sequence states on the unit-circle manifold S^1. Each block combines lightweight trainable phase-shifts with parameter-free Discrete Fourier Transform (DFT) token coupling, achieving global O(N log N) mixing without explicit attention maps. Stacking these blocks defines the Large Phasor Model (LPM). We validate LPM on autoregressive time-series prediction over synthetic multi-frequency benchmarks. Operating with a highly compact parameter budget, LPM learns stable global dynamics and achieves competitive forecasting behavior compared to conventional self-attention baselines. Our results establish an explicit efficiency-performance frontier, demonstrating that large-model scaling for time-series can emerge from geometry-constrained phase computation with deterministic global coupling, offering a practical path toward scalable temporal modeling in oscillatory domains.
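The core mechanism described above — trainable per-token phase shifts followed by parameter-free DFT coupling on the unit circle — can be sketched as follows. This is a minimal illustration of the idea, not the authors' implementation; the function name `phasor_block` and the choice to project back to S^1 by taking the phase of the transformed signal are assumptions.

```python
import numpy as np

def phasor_block(theta, phase_shift):
    """One hypothetical phasor-style mixing step (illustrative sketch).

    theta:       (N,) sequence state as phases on the unit circle S^1
    phase_shift: (N,) lightweight trainable phase-shift parameters
    """
    # Lift phases to unit-circle points and apply the trainable shift.
    z = np.exp(1j * (theta + phase_shift))
    # Parameter-free global token coupling via the DFT; with an FFT
    # this costs O(N log N) rather than the O(N^2) of attention maps.
    mixed = np.fft.fft(z)
    # Project back onto S^1 by keeping only the phase of each output.
    return np.angle(mixed)

rng = np.random.default_rng(0)
theta = rng.uniform(-np.pi, np.pi, size=64)   # toy sequence of 64 phases
shift = rng.uniform(-0.1, 0.1, size=64)       # stand-in for learned shifts
out = phasor_block(theta, shift)
print(out.shape)  # (64,)
```

Keeping only `np.angle` of the mixed signal is one simple way to enforce the S^1 constraint after mixing; the paper's blocks may use a different normalization.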