Few-Shot Forecasting of Time-Series with Heterogeneous Channels

  • Conference paper
Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2022)

Abstract

Learning complex time series forecasting models usually requires a large amount of data, as each model is trained from scratch for each task/data set. Leveraging learning experience with similar datasets is a well-established technique for classification problems, called few-shot classification. However, existing approaches cannot be applied to time-series forecasting because i) multivariate time-series datasets have different channels, and ii) forecasting is fundamentally different from classification. In this paper, we formalize the problem of few-shot forecasting of time-series with heterogeneous channels for the first time. Extending recent work on heterogeneous attributes in vector data, we develop a model composed of permutation-invariant deep set blocks that incorporate a temporal embedding. We assemble the first meta-dataset of 40 multivariate time-series datasets and show through experiments that our model generalizes well, outperforming baselines carried over from simpler scenarios that either fail to learn across tasks or miss temporal information.
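The core mechanism the abstract describes, encoding each channel independently and pooling with a permutation-invariant operation so the model is indifferent to channel order and count, can be sketched as below. This is a minimal illustrative sketch, not the paper's actual architecture: the encoder `phi`, decoder weights `W_rho`, and the sinusoidal `temporal_embedding` are hypothetical stand-ins for the paper's learned components, and the random weights are untrained.

```python
import numpy as np

rng = np.random.default_rng(0)

def temporal_embedding(T, d):
    # Sinusoidal positional features over the time axis; a simple stand-in
    # for the learned temporal embedding described in the paper.
    pos = np.arange(T)[:, None]
    freq = np.exp(-np.arange(d) / d)[None, :]
    return np.sin(pos * freq)  # shape (T, d)

def phi(x, W):
    # Per-channel encoder: maps each channel's flattened (series + embedding)
    # features to a latent vector, independently of the other channels.
    return np.tanh(x @ W)

def deep_set_forecast(X, W_phi, W_rho, d_emb=8):
    # X: (C, T) multivariate series with C heterogeneous channels.
    C, T = X.shape
    emb = temporal_embedding(T, d_emb)                      # (T, d_emb)
    feats = np.concatenate(
        [X[:, :, None], np.repeat(emb[None], C, axis=0)],   # attach time info
        axis=2,
    ).reshape(C, -1)                                        # (C, T*(1+d_emb))
    z = phi(feats, W_phi)                                   # encode each channel
    pooled = z.sum(axis=0)                                  # sum-pooling: invariant to channel order
    return pooled @ W_rho                                   # decode pooled set into a forecast

C, T, H, d_emb, d_lat = 3, 24, 6, 8, 16
X = rng.normal(size=(C, T))
W_phi = rng.normal(size=(T * (1 + d_emb), d_lat)) * 0.1
W_rho = rng.normal(size=(d_lat, H)) * 0.1

y = deep_set_forecast(X, W_phi, W_rho)
y_perm = deep_set_forecast(X[[2, 0, 1]], W_phi, W_rho)      # shuffle the channels
assert np.allclose(y, y_perm)                               # forecast unchanged
```

Because the channel dimension is reduced by a symmetric sum, the same weights apply to tasks with any number of channels in any order, which is what makes the deep-sets construction suitable for heterogeneous-channel meta-learning.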

L. Brinkmeyer and R. Drumond—Equal contribution.

Acknowledgements

This work was supported by the Federal Ministry for Economic Affairs and Climate Action (BMWK), Germany, within the framework of the IIP-Ecosphere project (project number: 01MK20006D).

Author information

Corresponding authors

Correspondence to Lukas Brinkmeyer or Rafael Rego Drumond.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 106 KB)

Rights and permissions

Reprints and permissions

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Brinkmeyer, L., Drumond, R.R., Burchert, J., Schmidt-Thieme, L. (2023). Few-Shot Forecasting of Time-Series with Heterogeneous Channels. In: Amini, MR., Canu, S., Fischer, A., Guns, T., Kralj Novak, P., Tsoumakas, G. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2022. Lecture Notes in Computer Science(), vol 13718. Springer, Cham. https://doi.org/10.1007/978-3-031-26422-1_1

  • DOI: https://doi.org/10.1007/978-3-031-26422-1_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-26421-4

  • Online ISBN: 978-3-031-26422-1

  • eBook Packages: Computer Science (R0)
