Granular Sketch Based Uncertain Time Series Streams Clustering

Chen, Jingyu; Chen, Ping; Sheng, Xian’gang

doi:10.1007/978-3-642-53932-9_53

Part of the book series: Communications in Computer and Information Science (CCIS, volume 391)

Included in the following conference series:

International Conference on Information Computing and Applications

Abstract

Uncertainty is inherent in data streams and poses new challenges for data stream mining. Because data streams arrive continuously and are very large, representing and clustering uncertain time series data streams requires significantly more space. It is therefore important to construct a compressed representation for storing uncertain time series data. Granular sketches and a bucket policy are designed using hash-compressed storage and micro clusters. Then, based on the max-min cluster distance measure, an algorithm for selecting initial cluster centers is proposed to improve the quality of clustering uncertain data streams. Finally, the effectiveness of the proposed algorithm is demonstrated through an analysis of the experimental results.
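
The abstract does not spell out the initial cluster centers selection algorithm, so the following is only a minimal sketch of the generic max-min distance heuristic (farthest-first traversal), not the authors' method. It assumes each micro cluster has already been reduced to a single representative point and that Euclidean distance is used; the function name `max_min_centers` and the sample points are hypothetical.

```python
import math


def max_min_centers(points, k):
    """Pick k initial centers with the max-min distance heuristic.

    Assumption: `points` are representatives (e.g. micro-cluster centroids),
    compared with Euclidean distance. Not taken from the paper itself.
    """
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")

    # Assumption: seed with the first point; other seeding rules are possible.
    centers = [points[0]]
    # min_dist[i] is the distance from points[i] to its nearest chosen center.
    min_dist = [math.dist(p, centers[0]) for p in points]

    while len(centers) < k:
        # Choose the point whose nearest center is farthest away.
        idx = max(range(len(points)), key=min_dist.__getitem__)
        centers.append(points[idx])
        # Refresh nearest-center distances against the newly added center.
        min_dist = [min(d, math.dist(p, points[idx]))
                    for p, d in zip(points, min_dist)]
    return centers


# Hypothetical representatives of five micro clusters.
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.1, 4.9), (0.0, 9.0)]
centers = max_min_centers(points, 3)
```

Spreading the initial centers apart in this way tends to avoid placing two of them in the same dense region, which is the usual motivation for max-min seeding before an iterative clustering step.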

Author information

Authors and Affiliations

School of Computer Science and Technology, Xidian University, 710071, Xi’an, China
Jingyu Chen
Software Engineering Institute, Xidian University, 710071, Xi’an, China
Jingyu Chen & Ping Chen
College of Information Engineering, Qingdao University, 266071, Qingdao, Shandong, China
Xian’gang Sheng

Editor information

Editors and Affiliations

Shanghai Jiao Tong University, 800 Dongchuan Road, Dianxinqunlou 1-401, 200240, Shanghai, China
Yuhang Yang
School of Electrical and Electronic Engineering, Nanyang Technological University, Nanyang Avenue, 639798, Singapore, Singapore
Maode Ma
College of Science, Hebei United University, 063009, Tangshan, Hebei, China
Baoxiang Liu

About this paper

Cite this paper

Chen, J., Chen, P., Sheng, X. (2013). Granular Sketch Based Uncertain Time Series Streams Clustering. In: Yang, Y., Ma, M., Liu, B. (eds) Information Computing and Applications. ICICA 2013. Communications in Computer and Information Science, vol 391. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-53932-9_53

DOI: https://doi.org/10.1007/978-3-642-53932-9_53
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-53931-2
Online ISBN: 978-3-642-53932-9
eBook Packages: Computer Science, Computer Science (R0)
