Abstract
In many application scenarios, data points are not only temporally dependent, but also arrive in the form of a fast-moving stream. A broad selection of efficient learning algorithms exists that can be applied to data streams, but they typically do not take into account the temporal nature of the data. We motivate and design a method which creates an efficient representation of a data stream, where temporal information is embedded into each instance via the error space of forecasting models. Unlike many other methods in the literature, our approach can be rapidly initialized and does not require iterations over the full data sequence, and is therefore suitable for a streaming scenario. This allows the application of off-the-shelf data-stream methods, depending on the application domain. In this paper, we investigate classification. We compare against a large variety of methods (auto-encoders, HMMs, basis functions, clustering methodologies, and PCA) and find that our proposed methods perform very competitively and offer much promise for future work.
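To make the idea of an error-space representation concrete, the following is a minimal sketch and not the authors' implementation. It assumes two stand-in forecasters that are not named in the abstract (a persistence forecast and a moving average over the last `tau` instances) and a hypothetical helper name, `error_space_stream`. Each incoming instance is replaced by the errors of those forecasts, computed in a single pass with a bounded buffer.

```python
from collections import deque


def error_space_stream(stream, tau=3):
    """Yield one error-space feature vector per instance, after a warm-up of tau instances.

    Assumption: the forecasters are a persistence forecast (previous value)
    and a moving average over the last `tau` instances. These are
    illustrative stand-ins, not the forecasting models used in the paper.
    """
    window = deque(maxlen=tau)  # holds only the last tau instances
    for x in stream:
        if len(window) == tau:
            d = len(x)
            last = window[-1]
            mean = [sum(past[j] for past in window) / tau for j in range(d)]
            # The errors of each forecaster become the new representation of x.
            persistence_err = [x[j] - last[j] for j in range(d)]
            mean_err = [x[j] - mean[j] for j in range(d)]
            yield persistence_err + mean_err
        window.append(x)  # single pass: no iteration over the full sequence


# A one-dimensional toy stream with a jump at the last step.
stream = [[0.0], [1.0], [2.0], [3.0], [10.0]]
features = list(error_space_stream(stream, tau=3))
print(features)
```

The resulting vectors could then be passed, one at a time, to any incremental classifier. The jump in the toy stream shows up as large errors in the final feature vector, which is the kind of temporal information a plain instance-wise learner would otherwise not see.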
Notes
i.e. \(x_t := {\mathbf {x}}_t\) here.
In relation to the window \(\tau \) considered by our method, \(W = \tau + 1\).
The distinction between clustering and latent variables is of course not always necessary.
Our code has been made available in the Scikit-MultiFlow framework [23].
References
Matsubara Y, Sakurai Y, Faloutsos C (2014) Autoplait: automatic mining of co-evolving time sequences. In: Proceedings of the 2014 ACM SIGMOD international conference on management of data, ser. SIGMOD ’14, pp 193–204. ACM, New York, NY, USA. https://doi.org/10.1145/2588555.2588556
Barber D (2012) Bayesian reasoning and machine learning. Cambridge University Press, Cambridge
Sutton RS, Barto AG (1998) Introduction to reinforcement learning. MIT Press, Cambridge
Bengio Y, Courville A, Vincent P (2013) Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell 35(8):1798–1828. https://doi.org/10.1109/TPAMI.2013.50
Bifet A, Holmes G, Pfahringer B, Gavaldà R (2009) Improving adaptive bagging methods for evolving data streams. In: Asian conference on machine learning
Gama J (2010) Knowledge discovery from data streams. Chapman & Hall/CRC, Boca Raton
Bifet A, Holmes G, Kirkby R, Pfahringer B (2010) MOA: massive online analysis. J Mach Learn Res (JMLR) 11:1601–1604
Žliobaitė I, Bifet A, Read J, Pfahringer B, Holmes G (2014) Evaluation methods and decision theory for classification of streaming data with temporal dependence. Mach Learn 98(3):455–482
Hollmén J, Tresp V (1999) Call-based fraud detection in mobile communications networks using a hierarchical regime-switching model. In: Proceedings of the 1998 conference advances in neural information processing systems II (NIPS’11), pp 889–895
Zafeiriou L, Nicolaou MA, Zafeiriou S, Nikitidis S, Pantic M (2016) Probabilistic slow features for behavior analysis. IEEE Trans Neural Netw Learn Syst 27(5):1034–1048. https://doi.org/10.1109/TNNLS.2015.2435653
Lughofer E, Weigl E, Heidl W, Eitzinger C, Radauer T (2016) Recognizing input space and target concept drifts in data streams with scarcely labeled and unlabelled instances. Inf Sci 355(C):127–151. https://doi.org/10.1016/j.ins.2016.03.034
Gama J, Žliobaitė I, Bifet A, Pechenizkiy M, Bouchachia A (2014) A survey on concept drift adaptation. ACM Comput Surv 46(4):44:1–44:37. https://doi.org/10.1145/2523813
Strutz T (2016) Data fitting and uncertainty. Springer, Berlin
Gomes HM, Bifet A, Read J, Barddal JP, Enembreck F, Pfahringer B, Holmes G, Abdessalem T (2017) Adaptive random forests for evolving data stream classification. Mach Learn 106(9–10):1469–1495. https://doi.org/10.1007/s10994-017-5642-8
Romeu P, Zamora-Martínez F, Botella-Rocamora P, Pardo J (2015) Stacked denoising auto-encoders for short-term time series forecasting. Springer, Cham, pp 463–486. https://doi.org/10.1007/978-3-319-09903-3_23
Hoffman MD, Blei DM, Wang C, Paisley J (2013) Stochastic variational inference. J Mach Learn Res (JMLR) 14:1303–1347
Read J, Perez-Cruz F, Bifet A (2015) Deep learning in multi-label data-streams. In: SAC 2015: 30th ACM symposium on applied computing. ACM
Oates T, Firoiu L, Cohen PR (1999) Clustering time series with hidden Markov models and dynamic time warping. In: Proceedings of the IJCAI-99 workshop on neural, symbolic and reinforcement learning methods for sequence learning, pp 17–21
Kohlmorgen J, Lemm S (2001) A dynamic HMM for on-line segmentation of sequential data. In: Proceedings of the 14th international conference on neural information processing systems: natural and synthetic, ser. NIPS’01, pp 793–800
Fern XZ, Brodley CE (2003) Random projection for high dimensional data clustering: a cluster ensemble approach. In: Fawcett T, Mishra N (eds) ICML, pp 186–193
Pichler K, Lughofer E, Pichler M, Buchegger T, Klement EP, Huschenbett M (2016) Fault detection in reciprocating compressor valves under varying load conditions. Mech Syst Signal Process 70–71:104–119
Fisher M, Huang F, Wright Z, Patton J (2014) Distributions in the error space: goal-directed movements described in time and state-space representations. In: International conference of the IEEE engineering in medicine and biology society, vol 2014, pp 6953–6956. Institute of Electrical and Electronics Engineers Inc.
Montiel J, Read J, Bifet A, Abdessalem T (2018) Scikit-Multiflow: a multi-output streaming framework. CoRR. https://github.com/scikit-multiflow/scikit-multiflow
About this article
Cite this article
Read, J., Tziortziotis, N. & Vazirgiannis, M. Error-space representations for multi-dimensional data streams with temporal dependence. Pattern Anal Applic 22, 1211–1220 (2019). https://doi.org/10.1007/s10044-018-0739-7
DOI: https://doi.org/10.1007/s10044-018-0739-7