Error-space representations for multi-dimensional data streams with temporal dependence

  • Short Paper
Pattern Analysis and Applications

Abstract

In many application scenarios, data points are not only temporally dependent but also arrive as a fast-moving stream. Many efficient learning algorithms can be applied to data streams, but they typically do not take the temporal nature of the data into account. We motivate and design a method that creates an efficient representation of a data stream in which temporal information is embedded into each instance via the error space of forecasting models. Unlike many other methods in the literature, our approach can be rapidly initialized and does not require iterations over the full data sequence, so it is suitable for a streaming scenario. The resulting representation allows off-the-shelf data-stream methods to be applied, chosen according to the application domain. In this paper, we investigate classification. We compare against a large variety of methods (auto-encoders, HMMs, basis functions, clustering methodologies, and PCA) and find that our proposed methods perform very competitively and offer much promise for future work.
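To make the idea of an error-space representation concrete, the following is a minimal single-pass sketch rather than the authors' implementation. The abstract does not say which forecasting models are used, so this sketch assumes naive persistence forecasters (the forecast of the current instance at lag k is simply the instance seen k steps earlier). The class name `ErrorSpaceTransformer`, the method `transform_one`, and the parameter `tau` are hypothetical names chosen for illustration and are not the Scikit-MultiFlow API.

```python
from collections import deque

import numpy as np


class ErrorSpaceTransformer:
    """Hypothetical helper: maps each incoming instance to forecast errors.

    Assumption: one persistence forecaster per lag k = 1..tau, i.e. the
    forecast for x_t is x_{t-k}. The paper's forecasting models may differ.
    """

    def __init__(self, tau=2):
        self.tau = tau
        # Buffer holding the tau most recent instances, oldest first.
        self.history = deque(maxlen=tau)

    def transform_one(self, x):
        x = np.asarray(x, dtype=float)
        if len(self.history) < self.tau:
            # Warm-up: not enough past instances to forecast from yet.
            self.history.append(x)
            return None
        # reversed() yields the most recent instance first, so the blocks
        # are ordered lag 1, lag 2, ..., lag tau.
        errors = [x - past for past in reversed(self.history)]
        self.history.append(x)
        return np.concatenate(errors)


# Toy two-dimensional stream, processed one instance at a time.
stream = [[1.0, 10.0], [2.0, 12.0], [4.0, 15.0], [7.0, 19.0]]
transformer = ErrorSpaceTransformer(tau=2)
features = [transformer.transform_one(x) for x in stream]
```

With `tau=2`, the first two instances only fill the buffer and yield `None`; the third instance `[4, 15]` is mapped to the error vector `[2, 3, 3, 5]` (lag-1 errors followed by lag-2 errors). Each output depends on the current instance plus `tau` past ones and needs no second pass over the data, so the transformed instances could be handed directly to an ordinary data-stream classifier.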

Notes

  1. i.e. \(x_t := \mathbf{x}_t\) here.

  2. In relation to the window \(\tau\) considered by our method, \(W = \tau + 1\).

  3. The distinction between clustering and latent variables is of course not always necessary.

  4. Our code has been made available in the Scikit-MultiFlow framework [23].

  5. http://scikit-learn.org.

  6. https://github.com/hmmlearn/hmmlearn.

  7. http://moa.cms.waikato.ac.nz/datasets/.

References

  1. Matsubara Y, Sakurai Y, Faloutsos C (2014) Autoplait: automatic mining of co-evolving time sequences. In: Proceedings of the 2014 ACM SIGMOD international conference on management of data, ser. SIGMOD ’14, pp 193–204. ACM, New York, NY, USA. https://doi.org/10.1145/2588555.2588556

  2. Barber D (2012) Bayesian reasoning and machine learning. Cambridge University Press, Cambridge

  3. Sutton RS, Barto AG (1998) Introduction to reinforcement learning. MIT Press, Cambridge

  4. Bengio Y, Courville A, Vincent P (2013) Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell 35(8):1798–1828. https://doi.org/10.1109/TPAMI.2013.50

  5. Bifet A, Holmes G, Pfahringer B, Gavaldà R (2009) Improving adaptive bagging methods for evolving data streams. In: Asian conference on machine learning

  6. Gama J (2010) Knowledge discovery from data streams. Chapman & Hall/CRC, Boca Raton

  7. Bifet A, Holmes G, Kirkby R, Pfahringer B (2010) MOA: massive online analysis. J Mach Learn Res (JMLR) 11:1601–1604

  8. Žliobaitė I, Bifet A, Read J, Pfahringer B, Holmes G (2014) Evaluation methods and decision theory for classification of streaming data with temporal dependence. Mach Learn 98(3):455–482

  9. Hollmén J, Tresp V (1999) Call-based fraud detection in mobile communications networks using a hierarchical regime-switching model. In: Proceedings of the 1998 conference advances in neural information processing systems II (NIPS’11), pp 889–895

  10. Zafeiriou L, Nicolaou MA, Zafeiriou S, Nikitidis S, Pantic M (2016) Probabilistic slow features for behavior analysis. IEEE Trans Neural Netw Learn Syst 27(5):1034–1048. https://doi.org/10.1109/TNNLS.2015.2435653

  11. Lughofer E, Weigl E, Heidl W, Eitzinger C, Radauer T (2016) Recognizing input space and target concept drifts in data streams with scarcely labeled and unlabelled instances. Inf Sci 355(C):127–151. https://doi.org/10.1016/j.ins.2016.03.034

  12. Gama J, Žliobaitė I, Bifet A, Pechenizkiy M, Bouchachia A (2014) A survey on concept drift adaptation. ACM Comput Surv 46(4):44:1–44:37. https://doi.org/10.1145/2523813

  13. Tilo S (2016) Data fitting and uncertainty. Springer, Berlin

  14. Gomes HM, Bifet A, Read J, Barddal JP, Enembreck F, Pfahringer B, Holmes G, Abdessalem T (2017) Adaptive random forests for evolving data stream classification. Mach Learn 106(9–10):1469–1495. https://doi.org/10.1007/s10994-017-5642-8

  15. Romeu P, Zamora-Martínez F, Botella-Rocamora P, Pardo J (2015) Stacked denoising auto-encoders for short-term time series forecasting. Springer, Cham, pp 463–486. https://doi.org/10.1007/978-3-319-09903-3_23

  16. Hoffman MD, Blei DM, Wang C, Paisley J (2013) Stochastic variational inference. J Mach Learn Res (JMLR) 14:1303–1347

  17. Read J, Perez-Cruz F, Bifet A (2015) Deep learning in multi-label data-streams. In: SAC 2015: 30th ACM symposium on applied computing. ACM

  18. Oates T, Firoiu L, Cohen PR (1999) Clustering time series with hidden Markov models and dynamic time warping. In: Proceedings of the IJCAI-99 workshop on neural, symbolic and reinforcement learning methods for sequence learning, pp 17–21

  19. Kohlmorgen J, Lemm S (2001) A dynamic HMM for on-line segmentation of sequential data. In: Proceedings of the 14th international conference on neural information processing systems: natural and synthetic, ser. NIPS’01, pp 793–800

  20. Fern XZ, Brodley CE (2003) Random projection for high dimensional data clustering: a cluster ensemble approach. In: Fawcett T, Mishra N (eds) ICML, pp 186–193

  21. Pichler K, Lughofer E, Pichler M, Buchegger T, Klement EP, Huschenbett M (2016) Fault detection in reciprocating compressor valves under varying load conditions. Mech Syst Signal Process 70–71:104–119

  22. Fisher M, Huang F, Wright Z, Patton J (2014) Distributions in the error space: goal-directed movements described in time and state-space representations. In: International conference of the IEEE engineering in medicine and biology society, vol 2014, pp 6953–6956. Institute of Electrical and Electronics Engineers Inc.

  23. Montiel J, Read J, Bifet A, Abdessalem T (2018) Scikit-Multiflow: a multi-output streaming framework. CoRR. https://github.com/scikit-multiflow/scikit-multiflow

Author information

Corresponding author

Correspondence to Jesse Read.

About this article

Cite this article

Read, J., Tziortziotis, N. & Vazirgiannis, M. Error-space representations for multi-dimensional data streams with temporal dependence. Pattern Anal Applic 22, 1211–1220 (2019). https://doi.org/10.1007/s10044-018-0739-7

  • DOI: https://doi.org/10.1007/s10044-018-0739-7
