Are Transformers Effective for Time Series Forecasting?

2022-05-26

Ailing Zeng, Muxi Chen, Lei Zhang, Qiang Xu

Abstract

Recently, there has been a surge of Transformer-based solutions for the long-term time series forecasting (LTSF) task. Despite the growing performance over the past few years, we question the validity of this line of research in this work. Specifically, Transformers are arguably the most successful solution to extract the semantic correlations among the elements in a long sequence. However, in time series modeling, we are to extract the temporal relations in an ordered set of continuous points. While employing positional encoding and using tokens to embed sub-series in Transformers facilitate preserving some ordering information, the nature of the permutation-invariant self-attention mechanism inevitably results in temporal information loss. To validate our claim, we introduce a set of embarrassingly simple one-layer linear models named LTSF-Linear for comparison. Experimental results on nine real-life datasets show that LTSF-Linear surprisingly outperforms existing sophisticated Transformer-based LTSF models in all cases, and often by a large margin. Moreover, we conduct comprehensive empirical studies to explore the impacts of various design elements of LTSF models on their temporal relation extraction capability. We hope this surprising finding opens up new research directions for the LTSF task. We also advocate revisiting the validity of Transformer-based solutions for other time series analysis tasks (e.g., anomaly detection) in the future. Code is available at: https://github.com/cure-lab/LTSF-Linear.
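
To make the idea of a "one-layer linear model" for forecasting concrete, here is a minimal sketch. It is not the authors' implementation: it assumes a single weight matrix that maps a look-back window directly to the forecast horizon, and it fits that matrix in closed form with ordinary least squares rather than by gradient-based training. The function names (`make_windows`, `fit_linear`, `predict`), the window sizes, and the synthetic series are all hypothetical and chosen only for illustration.

```python
import numpy as np


def make_windows(series, lookback, horizon):
    """Slice a 1-D series into (input window, target window) pairs."""
    n = len(series) - lookback - horizon + 1
    X = np.stack([series[i:i + lookback] for i in range(n)])
    Y = np.stack([series[i + lookback:i + lookback + horizon] for i in range(n)])
    return X, Y


def fit_linear(X, Y):
    """Fit Y ~ X @ W + b by ordinary least squares (an assumption of this
    sketch; it stands in for whatever training procedure a real model uses)."""
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    coef, *_ = np.linalg.lstsq(X1, Y, rcond=None)
    return coef[:-1], coef[-1]  # W: (lookback, horizon), b: (horizon,)


def predict(X, W, b):
    """Forecast all horizon steps at once from each look-back window."""
    return X @ W + b


# Synthetic data (illustrative only): linear trend + daily-like cycle + noise.
rng = np.random.default_rng(0)
t = np.arange(600)
series = 0.01 * t + np.sin(2 * np.pi * t / 24) + 0.05 * rng.standard_normal(t.size)

lookback, horizon = 48, 24
X, Y = make_windows(series, lookback, horizon)

split = 400  # earlier windows for fitting, later windows for evaluation
W, b = fit_linear(X[:split], Y[:split])
pred = predict(X[split:], W, b)

mse = float(np.mean((pred - Y[split:]) ** 2))
print("weight matrix shape:", W.shape)
print(f"test MSE: {mse:.4f}")
```

The whole model is one matrix multiplication plus a bias, so every forecast step is a fixed weighted sum of the ordered look-back values. The DLinear and NLinear variants listed in the benchmark results are not reproduced here.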

Benchmark Results

| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| Electricity (192) | DLinear | MSE | 0.15 | | Unverified |
| Electricity (336) | DLinear | MSE | 0.17 | | Unverified |
| Electricity (720) | DLinear | MSE | 0.2 | | Unverified |
| Electricity (96) | DLinear | MSE | 0.14 | | Unverified |
| ETTh1 (192) Multivariate | NLinear | MSE | 0.41 | | Unverified |
| ETTh1 (192) Multivariate | DLinear | MSE | 0.41 | | Unverified |
| ETTh1 (192) Univariate | DLinear | MSE | 0.07 | | Unverified |
| ETTh1 (336) Multivariate | DLinear | MSE | 0.44 | | Unverified |
| ETTh1 (336) Multivariate | NLinear | MSE | 0.43 | | Unverified |
| ETTh1 (336) Univariate | DLinear | MSE | 0.1 | | Unverified |
| ETTh1 (336) Univariate | NLinear | MSE | 0.08 | | Unverified |
| ETTh1 (720) Multivariate | NLinear | MSE | 0.44 | | Unverified |
| ETTh1 (720) Multivariate | DLinear | MSE | 0.47 | | Unverified |
| ETTh1 (720) Univariate | DLinear | MSE | 0.19 | | Unverified |
| ETTh1 (720) Univariate | NLinear | MSE | 0.08 | | Unverified |
| ETTh1 (96) Univariate | DLinear | MSE | 0.06 | | Unverified |
| ETTh1 (96) Univariate | NLinear | MSE | 0.05 | | Unverified |
| ETTh2 (192) Multivariate | NLinear | MSE | 0.34 | | Unverified |
| ETTh2 (192) Multivariate | DLinear | MSE | 0.38 | | Unverified |
| ETTh2 (192) Univariate | NLinear | MSE | 0.17 | | Unverified |
| ETTh2 (192) Univariate | DLinear | MSE | 0.18 | | Unverified |
| ETTh2 (336) Multivariate | DLinear | MSE | 0.45 | | Unverified |
| ETTh2 (336) Multivariate | NLinear | MSE | 0.36 | | Unverified |
| ETTh2 (336) Univariate | DLinear | MSE | 0.21 | | Unverified |
| ETTh2 (336) Univariate | NLinear | MSE | 0.19 | | Unverified |
| ETTh2 (720) Multivariate | NLinear | MSE | 0.39 | | Unverified |
| ETTh2 (720) Multivariate | DLinear | MSE | 0.61 | | Unverified |
| ETTh2 (720) Univariate | NLinear | MSE | 0.23 | | Unverified |
| ETTh2 (720) Univariate | DLinear | MSE | 0.28 | | Unverified |
| ETTh2 (96) Multivariate | NLinear | MSE | 0.28 | | Unverified |
| ETTh2 (96) Multivariate | DLinear | MSE | 0.29 | | Unverified |
| ETTh2 (96) Univariate | NLinear | MSE | 0.13 | | Unverified |
| ETTh2 (96) Univariate | DLinear | MSE | 0.13 | | Unverified |
| Weather (192) | DLinear | MSE | 0.22 | | Unverified |
| Weather (336) | DLinear | MSE | 0.27 | | Unverified |
| Weather (720) | DLinear | MSE | 0.32 | | Unverified |
| Weather (96) | DLinear | MSE | 0.18 | | Unverified |