Granular Sketch Based Uncertain Time Series Streams Clustering

  • Conference paper

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 391))

Abstract

Uncertainty is inherent in data streams and presents new challenges for data stream mining. Because data streams arrive continuously and are very large, representing and clustering uncertain time series data streams requires significantly more space. It is therefore important to construct a compressed representation for storing uncertain time series data. Granular sketches and a bucket policy are designed using hash-compressed storage and micro-clusters. Then, based on the max-min cluster distance measure, an initial cluster center selection algorithm is proposed to improve the quality of clustering uncertain data streams. Finally, the effectiveness of the proposed algorithm is illustrated by analyzing the experimental results.
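The abstract does not give the details of the initial cluster center selection algorithm. As a rough illustration of the general max-min distance idea only, the sketch below uses the generic farthest-first rule: each new center is the point whose distance to its nearest already-chosen center is largest. The function name `max_min_centers`, the Euclidean distance, and seeding with the first point are all assumptions made for this example, not the authors' method.

```python
import math


def max_min_centers(points, k):
    """Pick k initial centers with the generic max-min (farthest-first) rule.

    Assumptions for illustration: points are equal-length numeric tuples,
    distance is Euclidean, and the first point seeds the selection.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")

    centers = [points[0]]
    # nearest[i] holds the distance from points[i] to its closest chosen center.
    nearest = [math.dist(p, centers[0]) for p in points]

    while len(centers) < k:
        # Choose the point that is farthest from all centers chosen so far.
        idx = max(range(len(points)), key=nearest.__getitem__)
        centers.append(points[idx])
        # Update each point's distance to its closest center.
        nearest = [
            min(d, math.dist(p, points[idx])) for d, p in zip(nearest, points)
        ]
    return centers


if __name__ == "__main__":
    sample = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
    print(max_min_centers(sample, 3))
```

Spreading the initial centers apart in this way tends to avoid starting several centers inside the same dense region, which is one common motivation for max-min style initialization.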

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chen, J., Chen, P., Sheng, X. (2013). Granular Sketch Based Uncertain Time Series Streams Clustering. In: Yang, Y., Ma, M., Liu, B. (eds) Information Computing and Applications. ICICA 2013. Communications in Computer and Information Science, vol 391. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-53932-9_53

  • DOI: https://doi.org/10.1007/978-3-642-53932-9_53

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-53931-2

  • Online ISBN: 978-3-642-53932-9

  • eBook Packages: Computer Science, Computer Science (R0)
