Clustering and Change Detection in Multiple Streaming Time Series

Balzanella, Antonio; Verde, Rosanna

doi:10.1007/978-3-319-03859-9_1

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8285)

Included in the following conference series:

International Conference on Algorithms and Architectures for Parallel Processing

Abstract

In recent years, Data Stream Mining (DSM) has received a lot of attention because a growing number of application contexts generate temporally ordered, fast-changing, and potentially infinite data. To deal with such data, learning techniques must satisfy several computational and storage constraints, so new and specific methods have to be developed. In this paper we introduce a new strategy for clustering streaming time series. The method detects a partition of the streams over a user-chosen time period and discovers how the proximity relations among them evolve. We show that these aims can be reached by clustering temporally non-overlapping data batches as they arrive on-line and then running a suitable clustering algorithm on a dissimilarity matrix that is updated using the outputs of the local clustering. Through an application to real and simulated data, we show that the method provides results comparable to those of algorithms for stored data.
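
The two-stage scheme described in the abstract (local clustering of each batch, then a global clustering on an updated dissimilarity matrix) can be illustrated with a minimal sketch. The abstract does not give the concrete algorithms or the update rule, so everything specific here is an assumption made for illustration: the local step uses a plain k-means with farthest-first initialisation, the matrix update adds 1 to the dissimilarity of two streams each time they fall into different local clusters, and the final step is an average-linkage agglomeration. The names `cluster_streams`, `batch_len`, `k_local`, and `k_global` are hypothetical and do not come from the paper.

```python
from itertools import combinations


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_labels(points, k, iters=10):
    """Local step (assumed): Lloyd's k-means, returns one label per point."""
    # Farthest-first initialisation keeps the sketch deterministic.
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave an empty cluster's center where it was
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def update_dissimilarity(D, labels):
    """Assumed update rule: +1 when two streams land in different local clusters."""
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] != labels[j]:
            D[i][j] += 1
            D[j][i] += 1


def agglomerate(D, k):
    """Global step (assumed): average-linkage merging until k clusters remain."""
    clusters = [[i] for i in range(len(D))]

    def link(a, b):
        return sum(D[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda ab: link(clusters[ab[0]], clusters[ab[1]]))
        clusters[a] += clusters[b]  # a < b, so deleting b keeps index a valid
        del clusters[b]
    return clusters


def cluster_streams(streams, batch_len, k_local, k_global):
    """Process non-overlapping batches, then cluster the streams globally."""
    n = len(streams)
    D = [[0] * n for _ in range(n)]
    length = min(len(s) for s in streams)
    for start in range(0, length - batch_len + 1, batch_len):
        # Each stream contributes one subsequence to the current batch.
        batch = [s[start:start + batch_len] for s in streams]
        update_dissimilarity(D, kmeans_labels(batch, k_local))
    return D, agglomerate(D, k_global)
```

In this sketch only the current batch and the n-by-n matrix `D` are held in memory, which reflects the storage constraint mentioned in the abstract. Under the same assumptions, resetting `D` at the start of each user-chosen period and comparing the matrices of successive periods would be one simple way to look for changes in the proximity relations among streams.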

Author information

Authors and Affiliations

Dep. of Political Sciences, Second University of Naples, Italy
Antonio Balzanella & Rosanna Verde

Editor information

Editors and Affiliations

Institute of Computer Science, Cracow University of Technology, Warszawska 24, 31-155, Cracow, Poland
Joanna Kołodziej
Dipartimento di Ingegneria, Seconda Università di Napoli, 81031, Aversa, CE, Italy
Beniamino Di Martino
DIMES and ICAR-CNR, c/o Università della Calabria, 87036, Rende, CS, Italy
Domenico Talia
College of Computing and Information Sciences, Rochester Institute of Technology, 14623, Rochester, NY, USA
Kaiqi Xiong

Cite this paper

Balzanella, A., Verde, R. (2013). Clustering and Change Detection in Multiple Streaming Time Series. In: Kołodziej, J., Di Martino, B., Talia, D., Xiong, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2013. Lecture Notes in Computer Science, vol 8285. Springer, Cham. https://doi.org/10.1007/978-3-319-03859-9_1

DOI: https://doi.org/10.1007/978-3-319-03859-9_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03858-2
Online ISBN: 978-3-319-03859-9
eBook Packages: Computer Science, Computer Science (R0)