Clustering and Change Detection in Multiple Streaming Time Series

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2013)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8285)

Abstract

In recent years, Data Stream Mining (DSM) has received considerable attention because a growing number of application contexts generate temporally ordered, fast-changing, and potentially infinite data. To deal with such data, learning techniques must satisfy several computational and storage constraints, so new, specific methods have to be developed. In this paper we introduce a new strategy for clustering streaming time series. The method detects a partition of the streams over a user-chosen time period and discovers how proximity relations among the streams evolve. We show that these aims can be reached by clustering temporally non-overlapping data batches as they arrive on-line and then running a suitable clustering algorithm on a dissimilarity matrix that is updated using the outputs of the local clusterings. Through an application to real and simulated data, we show that this method provides results comparable to those of algorithms for stored data.
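The abstract does not spell out the algorithmic details, so the following is only a minimal sketch of the two-stage idea it describes, not the authors' method. Everything specific here is an assumption made for illustration: plain k-means as the local clustering of each batch, a dissimilarity update that adds 1 whenever two streams fall in different local clusters, average-linkage merging as the final clustering, and all function names (`kmeans_labels`, `update_dissimilarity`, `average_linkage`, `cluster_streams`).

```python
import random
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_labels(rows, k, iters=20, seed=0):
    """Local clustering (assumed: plain Lloyd k-means); one label per row."""
    rng = random.Random(seed)
    centers = [list(r) for r in rng.sample(rows, k)]
    labels = [0] * len(rows)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(r, centers[c]))
                  for r in rows]
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old center if a cluster goes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def update_dissimilarity(D, labels):
    """Assumed update rule: +1 for each pair of streams split by a batch."""
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] != labels[j]:
            D[i][j] += 1
            D[j][i] += 1


def average_linkage(D, k):
    """Final clustering (assumed: average-linkage merging) on matrix D."""
    clusters = [[i] for i in range(len(D))]

    def avg(pair):
        a, b = clusters[pair[0]], clusters[pair[1]]
        return sum(D[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        a, b = min(combinations(range(len(clusters)), 2), key=avg)
        merged = clusters.pop(b)  # a < b, so index a is unaffected
        clusters[a].extend(merged)
    labels = [0] * len(D)
    for c, members in enumerate(clusters):
        for m in members:
            labels[m] = c
    return labels


def cluster_streams(streams, batch_width, k):
    """Process non-overlapping batches in order, then cluster the matrix."""
    n = len(streams)
    D = [[0] * n for _ in range(n)]
    length = min(len(s) for s in streams)
    for start in range(0, length - batch_width + 1, batch_width):
        batch = [s[start:start + batch_width] for s in streams]
        update_dissimilarity(D, kmeans_labels(batch, k))
    return average_linkage(D, k), D


# Toy data: two streams near 0 and two near 10, 20 observations each.
streams = [[base + 0.1 * (t % 3) for t in range(20)]
           for base in (0.0, 0.2, 10.0, 10.2)]
labels, D = cluster_streams(streams, batch_width=5, k=2)
print(labels)
print(D)
```

Each batch is discarded once its local labels have updated the matrix, so memory use depends on the number of streams rather than on stream length. In this sketch, restricting the update to the batches inside a chosen window would give a partition for that period, and comparing matrices from different windows would hint at changes in proximity relations.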




Copyright information

© 2013 Springer International Publishing Switzerland

About this paper

Cite this paper

Balzanella, A., Verde, R. (2013). Clustering and Change Detection in Multiple Streaming Time Series. In: Kołodziej, J., Di Martino, B., Talia, D., Xiong, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2013. Lecture Notes in Computer Science, vol 8285. Springer, Cham. https://doi.org/10.1007/978-3-319-03859-9_1

  • DOI: https://doi.org/10.1007/978-3-319-03859-9_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-03858-2

  • Online ISBN: 978-3-319-03859-9

  • eBook Packages: Computer Science, Computer Science (R0)
