Abstract
One goal of this thesis is to study and implement a data stream clustering algorithm that runs quickly and clusters accurately. To this end, we first discuss the background and related work on data stream mining, summarize popular traditional clustering algorithms, and examine existing data stream clustering algorithms. On this basis, we propose the GD-Stream (Grid-Density based Evolving Stream) algorithm, a grid-density-based framework with a modified synopsis data structure. Borrowing the framework of the CluStream algorithm, GD-Stream is divided into an online layer and an offline layer and uses a density-decaying technique. The online layer reads the data stream rapidly and stores the relevant information in the synopsis data structure. Using this information, the offline layer provides accurate clustering. The two layers work together to balance accuracy and speed.
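The abstract gives no implementation details, so the following is only a minimal sketch of how a two-layer, grid-density scheme with density decay might look; it is not the GD-Stream algorithm itself. The class name GridSynopsis, the function offline_cluster, and the parameters cell_size, decay, and dense_threshold are all hypothetical, as is the choice to merge dense cells that share a face.

```python
from collections import deque


class GridSynopsis:
    """Online layer (sketch): map each point to a grid cell and keep a decayed density."""

    def __init__(self, cell_size=1.0, decay=0.998):
        self.cell_size = cell_size  # assumption: uniform cell width in every dimension
        self.decay = decay          # assumption: per-time-step decay factor in (0, 1]
        self.cells = {}             # cell index -> (density, time of last update)

    def _cell_of(self, point):
        # Floor-divide each coordinate to get the integer cell index.
        return tuple(int(x // self.cell_size) for x in point)

    def insert(self, point, t):
        key = self._cell_of(point)
        density, last = self.cells.get(key, (0.0, t))
        # Decay lazily: the stored density is aged only when the cell is touched.
        self.cells[key] = (density * self.decay ** (t - last) + 1.0, t)

    def density(self, key, now):
        density, last = self.cells[key]
        return density * self.decay ** (now - last)


def offline_cluster(synopsis, now, dense_threshold=3.0):
    """Offline layer (sketch): label connected groups of dense cells."""
    dense = {k for k in synopsis.cells
             if synopsis.density(k, now) >= dense_threshold}
    labels, next_label = {}, 0
    for start in sorted(dense):
        if start in labels:
            continue
        labels[start] = next_label
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            # Visit the two face-adjacent neighbours along each dimension.
            for dim in range(len(cell)):
                for step in (-1, 1):
                    nb = cell[:dim] + (cell[dim] + step,) + cell[dim + 1:]
                    if nb in dense and nb not in labels:
                        labels[nb] = next_label
                        queue.append(nb)
        next_label += 1
    return labels
```

In this sketch the online layer does constant work per arriving point and never stores the points themselves, while the offline layer runs only on demand over the much smaller set of cells; that division of labour is one way to read the accuracy and speed balance the abstract describes.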
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhishui, Z. (2011). A Kind of Data Stream Clustering Algorithm Based on Grid-Density. In: Lin, S., Huang, X. (eds) Advances in Computer Science, Environment, Ecoinformatics, and Education. CSEE 2011. Communications in Computer and Information Science, vol 215. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23324-1_67
DOI: https://doi.org/10.1007/978-3-642-23324-1_67
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23323-4
Online ISBN: 978-3-642-23324-1
eBook Packages: Computer Science, Computer Science (R0)