ABSTRACT
This paper describes a novel method for clustering single- and multi-dimensional data streams. By processing incoming data incrementally, our method determines whether the cluster formation should change from its initial state. Four main types of cluster evolution are studied: cluster appearance, cluster disappearance, cluster splitting, and cluster merging. We present experimental results for our algorithms in terms of both scalability and cluster quality, compared with recent work in this area.
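The abstract gives no algorithmic detail, so the following is only a minimal illustrative sketch of how incremental cluster summaries and evolution events might look for a one-dimensional stream. It is not the authors' method. The names `Cluster`, `merge`, and `process`, and the thresholds `radius`, `merge_dist`, and `max_idle`, are all hypothetical. Cluster splitting is left out because it needs more than a running mean and variance (for example, a per-cluster histogram).

```python
from dataclasses import dataclass
import math


@dataclass
class Cluster:
    """Running summary of a 1-D cluster; raw points are never stored."""
    n: int            # number of points absorbed so far
    mean: float       # running mean
    m2: float         # running sum of squared deviations from the mean
    last_seen: int    # time step of the most recent point

    def add(self, x: float, t: int) -> None:
        # Welford's incremental update of mean and squared deviations.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        self.last_seen = t

    @property
    def std(self) -> float:
        # Population standard deviation of the points seen so far.
        return math.sqrt(self.m2 / self.n) if self.n > 1 else 0.0


def merge(a: Cluster, b: Cluster) -> Cluster:
    # Combine two summaries using the pairwise mean/variance update,
    # without revisiting the raw points.
    n = a.n + b.n
    delta = b.mean - a.mean
    mean = a.mean + delta * b.n / n
    m2 = a.m2 + b.m2 + delta * delta * a.n * b.n / n
    return Cluster(n, mean, m2, max(a.last_seen, b.last_seen))


def process(stream, radius=1.0, merge_dist=0.5, max_idle=100):
    """Consume a 1-D stream; return (clusters, events).

    All thresholds are assumptions chosen for illustration only.
    """
    clusters, events = [], []
    for t, x in enumerate(stream):
        nearest = min(clusters, key=lambda c: abs(c.mean - x), default=None)
        if nearest is None or abs(nearest.mean - x) > radius:
            # Appearance: the point is far from every existing cluster.
            clusters.append(Cluster(1, x, 0.0, t))
            events.append((t, "appear"))
        else:
            nearest.add(x, t)

        # Disappearance: drop clusters that have received no recent points.
        alive = [c for c in clusters if t - c.last_seen <= max_idle]
        if len(alive) < len(clusters):
            events.append((t, "disappear"))
        clusters = alive

        # Merging: fuse neighbouring clusters whose means have drifted close.
        clusters.sort(key=lambda c: c.mean)
        i = 0
        while i + 1 < len(clusters):
            if clusters[i + 1].mean - clusters[i].mean < merge_dist:
                clusters[i:i + 2] = [merge(clusters[i], clusters[i + 1])]
                events.append((t, "merge"))
            else:
                i += 1
    return clusters, events
```

Calling `process([0.0, 0.1, 5.0, 5.1, 0.2])` with the default thresholds reports two appearance events, at time steps 0 and 2, and leaves two clusters. Each point is handled once and only constant-size summaries are kept, which is what makes this kind of update suitable for transient streams.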
Index Terms
- A method for clustering transient data streams