Abstract
Time series forecasting (TSF) is crucial in many real-world applications; this paper studies the long-term forecasting problem. Recent research has shown that Transformer-based models can improve forecasting accuracy, but their computational cost is a significant obstacle for Long Sequence Time-series Forecasting (LSTF). To mitigate this, some researchers have proposed sparse attention networks that reduce computational cost, but sparse attention alone can suffer from low information utilization, which hinders long-term forecasting performance. To address this issue, this paper proposes Double-layer Efficient ProbSparse self-attention (DEPformer) for LSTF. DEPformer combines a sparse attention network with an attention network that extracts global context vectors, compensating for the low information utilization of sparse attention alone and enhancing long-term forecasting performance. Experiments on standard and real-world datasets show that DEPformer outperforms previous mainstream models.
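The idea described in the abstract can be sketched as follows. This is not the authors' code: the shapes, function names, and the additive way the two branches are fused are assumptions for illustration only. The sparse branch follows the ProbSparse scheme (only the most "active" queries attend; lazy queries fall back to the mean of the values), and the second branch computes a single global context vector over the whole sequence.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def probsparse_attention(Q, K, V, u):
    """Attend with only the u most 'active' queries; lazy queries fall
    back to the mean of V, as in ProbSparse self-attention."""
    L, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)                  # (L, L)
    # Sparsity measure M(q) = max_k s(q,k) - mean_k s(q,k)
    sparsity = scores.max(axis=1) - scores.mean(axis=1)
    top = np.argsort(sparsity)[-u:]                # indices of active queries
    out = np.tile(V.mean(axis=0), (L, 1))          # lazy queries -> mean(V)
    out[top] = softmax(scores[top], axis=1) @ V    # full attention for top-u
    return out

def global_context(K, V):
    """Linear-complexity global context: softmax over key positions,
    then a weighted sum of values (one vector summarizing the sequence)."""
    w = softmax(K, axis=0)                         # (L, d), per-column weights
    return (w * V).sum(axis=0)                     # (d,)

rng = np.random.default_rng(0)
L, d = 16, 8
Q, K, V = rng.standard_normal((3, L, d))
# Fuse the sparse branch with the broadcast global context vector
out = probsparse_attention(Q, K, V, u=4) + global_context(K, V)
print(out.shape)   # (16, 8)
```

The global-context branch costs O(L·d) rather than O(L²), so adding it back does not undo the savings of the sparse branch while reinjecting information from every position.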
Acknowledgments
This work is supported by the “Tianjin Project + Team” Key Training Project under Grant No. XC202022.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ma, J., Wang, X., Xiao, Y. (2023). Double-Layer Attention for Long Sequence Time-Series Forecasting. In: Strauss, C., Amagasa, T., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Database and Expert Systems Applications. DEXA 2023. Lecture Notes in Computer Science, vol 14147. Springer, Cham. https://doi.org/10.1007/978-3-031-39821-6_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-39820-9
Online ISBN: 978-3-031-39821-6
eBook Packages: Computer Science, Computer Science (R0)