ABSTRACT
In this paper, the upper and lower approximations of rough set theory are introduced to describe the micro-cluster features used when clustering uncertain data streams. The proposed algorithm represents the micro-cluster timestamp with time decay and uses an agglomerative clustering method to form new clusters in the outlier buffer. Experimental results show that the proposed algorithm can generate natural clusters and outperforms the existing method in terms of accuracy.
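For readers unfamiliar with the rough set terms used above, the following is a minimal sketch of the classical (Pawlak) lower and upper approximations of a set under an equivalence relation. It is not the paper's micro-cluster feature: the function name `approximations`, the `key` labelling function, and the toy data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def approximations(universe, key, target):
    """Return the (lower, upper) approximations of `target`.

    `key` is an assumed labelling function: two objects are treated as
    indiscernible exactly when `key` gives them the same label.
    """
    # Group the universe into equivalence classes by label.
    classes = defaultdict(set)
    for obj in universe:
        classes[key(obj)].add(obj)

    target = set(target)
    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:   # class lies entirely inside the target
            lower |= block
        if block & target:    # class overlaps the target at all
            upper |= block
    return lower, upper


# Toy universe 1..8 with equivalence classes {1,2}, {3,4}, {5,6}, {7,8}.
universe = range(1, 9)
key = lambda x: (x - 1) // 2
target = {1, 2, 3, 7}

lower, upper = approximations(universe, key, target)
boundary = upper - lower  # objects that only possibly belong to the target
```

In this toy case the lower approximation holds the objects that certainly belong to the target, while the boundary holds those whose membership cannot be decided from the equivalence classes alone. A stream clustering method can use that distinction to treat core and borderline points of a micro-cluster differently.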
Index Terms
- Clustering Data Stream with Rough Set