SOTAVerified

Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting

2020-12-14 · Code Available

Haoyi Zhou, Shanghang Zhang, Jieqi Peng, Shuai Zhang, Jianxin Li, Hui Xiong, Wancai Zhang


Abstract

Many real-world applications require the prediction of long sequence time-series, such as electricity consumption planning. Long sequence time-series forecasting (LSTF) demands a high prediction capacity of the model, which is the ability to capture precise long-range dependency coupling between output and input efficiently. Recent studies have shown the potential of Transformer to increase the prediction capacity. However, there are several severe issues with Transformer that prevent it from being directly applicable to LSTF, including quadratic time complexity, high memory usage, and the inherent limitation of the encoder-decoder architecture. To address these issues, we design an efficient transformer-based model for LSTF, named Informer, with three distinctive characteristics: (i) a ProbSparse self-attention mechanism, which achieves O(L log L) time complexity and memory usage while maintaining comparable performance on sequences' dependency alignment; (ii) self-attention distilling, which highlights dominating attention by halving the cascading layer input and efficiently handles extremely long input sequences; (iii) a generative-style decoder which, while conceptually simple, predicts long time-series sequences in a single forward operation rather than step by step, drastically improving the inference speed of long-sequence predictions. Extensive experiments on four large-scale datasets demonstrate that Informer significantly outperforms existing methods and provides a new solution to the LSTF problem.
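The O(L log L) cost of ProbSparse self-attention comes from scoring each query by a sparsity measurement (max minus mean of its scaled dot products with the keys) and computing full attention only for the top-u queries, with u on the order of log L. The sketch below is a simplified, hedged illustration of that idea in numpy: it omits the paper's key-sampling step used to approximate the measurement, and the function name and `factor` parameter are illustrative, not the reference implementation.

```python
import numpy as np

def probsparse_attention(Q, K, V, factor=5):
    """Simplified ProbSparse self-attention sketch (assumption: full scores
    are computed here; the paper approximates them by sampling keys)."""
    L_Q, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)                    # (L_Q, L_K) scaled dot products
    # Sparsity measurement M(q_i, K): max over keys minus mean over keys.
    M = scores.max(axis=1) - scores.mean(axis=1)
    # Keep only u ~ c * ln(L_Q) "active" queries.
    u = min(L_Q, int(factor * np.ceil(np.log(L_Q))))
    top = np.argsort(M)[-u:]
    # Lazy queries fall back to the mean of V; active queries get full attention.
    out = np.tile(V.mean(axis=0), (L_Q, 1))
    act = scores[top] - scores[top].max(axis=1, keepdims=True)  # stable softmax
    attn = np.exp(act)
    attn /= attn.sum(axis=1, keepdims=True)
    out[top] = attn @ V
    return out
```

Because only u = O(log L) queries attend over all L keys, the dominant cost drops from O(L²) to O(L log L) per layer.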

Benchmark Results

| Dataset | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| ETTh1 (24) Multivariate | Informer | MSE | 0.51 | — | Unverified |
| ETTh1 (24) Univariate | Informer | MSE | 0.05 | — | Unverified |
| ETTh1 (48) Multivariate | Informer | MSE | 0.55 | — | Unverified |
| ETTh1 (48) Univariate | Informer | MSE | 0.13 | — | Unverified |
| ETTh1 (168) Multivariate | Informer | MSE | 0.88 | — | Unverified |
| ETTh1 (168) Univariate | Informer | MSE | 0.18 | — | Unverified |
| ETTh1 (336) Multivariate | Informer | MSE | 0.88 | — | Unverified |
| ETTh1 (336) Univariate | Informer | MSE | 0.19 | — | Unverified |
| ETTh1 (720) Multivariate | Informer | MSE | 0.94 | — | Unverified |
| ETTh1 (720) Univariate | Informer | MSE | 0.20 | — | Unverified |
| ETTh2 (24) Multivariate | Informer | MSE | 0.45 | — | Unverified |
| ETTh2 (24) Univariate | Informer | MSE | 0.08 | — | Unverified |
| ETTh2 (48) Multivariate | Informer | MSE | 0.93 | — | Unverified |
| ETTh2 (48) Univariate | Informer | MSE | 0.11 | — | Unverified |
| ETTh2 (168) Multivariate | Informer | MSE | 1.51 | — | Unverified |
| ETTh2 (168) Univariate | Informer | MSE | 0.15 | — | Unverified |
| ETTh2 (336) Multivariate | Informer | MSE | 1.67 | — | Unverified |
| ETTh2 (336) Univariate | Informer | MSE | 0.17 | — | Unverified |
| ETTh2 (720) Multivariate | Informer | MSE | 2.34 | — | Unverified |
| ETTh2 (720) Univariate | Informer | MSE | 0.18 | — | Unverified |
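The MSE figures above are standard mean squared error averaged over every forecast step and variable of the prediction window. A minimal sketch for anyone reproducing the numbers (array names are illustrative, not from the Informer codebase):

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error over all prediction-window entries."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))
```

For the multivariate rows, `y_true` and `y_pred` would have shape (horizon, num_variables); for the univariate rows, shape (horizon,).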

Reproductions