DOI: 10.1145/3373509.3373521
research-article

Clustering Data Stream with Rough Set

Published: 25 March 2020

ABSTRACT

In this paper, the upper and lower approximations of rough sets are introduced to describe the micro-cluster feature when clustering an uncertain data stream. The proposed algorithm represents the micro-cluster timestamp with time decay and uses an agglomerative clustering method to form new clusters in the buffer of outliers. Experimental results show that the proposed algorithm can generate natural clusters and outperforms the existing method in terms of accuracy.
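The abstract gives no formulas, so the following is only a minimal sketch of how a micro-cluster described by lower and upper approximations with time decay might be maintained. It is not the paper's algorithm. The class `RoughMicroCluster`, the function `assign`, the parameters `lam`, `ratio`, and `radius`, and the decay form 2^(-lam * dt) are all assumptions chosen for illustration, in the style of rough k-means assignment and common fading functions for data streams. The agglomerative step over the outlier buffer is omitted.

```python
import math
from dataclasses import dataclass


@dataclass
class RoughMicroCluster:
    """Hypothetical micro-cluster summary; field names are illustrative."""
    center: list[float]
    lower_weight: float = 0.0  # decayed count of points certainly in the cluster
    upper_weight: float = 0.0  # decayed count of points possibly in the cluster
    last_update: float = 0.0   # timestamp of the most recent update

    def decay(self, now: float, lam: float) -> None:
        # Assumed exponential fading 2^(-lam * dt) applied to both weights.
        factor = 2.0 ** (-lam * (now - self.last_update))
        self.lower_weight *= factor
        self.upper_weight *= factor
        self.last_update = now


def assign(point, clusters, now, lam=0.01, ratio=1.3, radius=1.0):
    """Return the indices of the clusters whose upper approximation receives
    `point`; an empty list means the point is an outlier candidate."""
    dists = [math.dist(point, c.center) for c in clusters]
    if not dists or min(dists) > radius:
        return []  # the caller would place the point in the outlier buffer
    nearest = min(dists)
    # Clusters almost as close as the nearest one share the point (boundary case).
    close = [i for i, d in enumerate(dists)
             if d <= ratio * nearest and d <= radius]
    for i in close:
        cluster = clusters[i]
        cluster.decay(now, lam)
        cluster.upper_weight += 1.0
        if len(close) == 1:
            # Unambiguous membership: the point also counts toward the lower approximation.
            cluster.lower_weight += 1.0
    return close
```

In this sketch a point that is clearly nearest to one micro-cluster raises both weights of that cluster, while a point that is roughly equidistant from several micro-clusters raises only their upper weights, so the gap between the two weights reflects how uncertain the cluster's boundary is.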


Published in

ICCPR '19: Proceedings of the 2019 8th International Conference on Computing and Pattern Recognition
October 2019
522 pages
ISBN: 9781450376570
DOI: 10.1145/3373509

Copyright © 2019 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


Qualifiers

research-article; Research; Refereed limited