Skip to main content
Log in

On rough set based fuzzy clustering for graph data

  • Original Article
  • Published:
International Journal of Machine Learning and Cybernetics Aims and scope Submit manuscript

Abstract

Data clustering refers to partition the original data set into some subsets such that every vertex belongs to one or more subsets at the same time. For graph data that composed by attribute information of vertices as well as structural information between vertices, how to make an efficient clustering is not an easy thing. In this paper, we propose a novel method of how to partition graph data into some overlapping subgraph data in aspect of rough set theory. At first, we introduce a detailed description about the global similarity measurement of vertices. After that, an objective-function oriented optimization model is constructed in terms of updating fuzzy membership degree and cluster center that based on the theory of rough set. Obviously, the determined cluster is no longer a fuzzy set, but a rough set, that is to say, the cluster is expressed by the upper approximation set and lower approximation set. Finally, eleven real-world graph data and four synthetic graph data are applied to verify the validity of the proposed fuzzy clustering algorithm. The experimental results show that our algorithm is better than existing clustering approach to some extent.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15
Fig. 16
Fig. 17
Fig. 18
Fig. 19
Fig. 20
Fig. 21
Fig. 22
Fig. 23
Fig. 24
Fig. 25

Similar content being viewed by others

Notes

  1. Without loss of generality, in this paper we adhere to the hypothesis that \(e_{jj^{\prime }}\in \{0,1\}\) for all \(j, j^{\prime }=1,2,\ldots ,n\). What is more, it should be pointed out that \(e_{jj^{\prime }}=e_{j^{\prime }j}\), that is to say, \({\mathscr {G}}=({\mathscr {X}},{\mathscr {E}})\) is an undirected graph data.

  2. For a special case that \(m=0\), then FCM turns into the famous k-means clustering algorithm.

  3. In fact, the parameter \(\beta\) is not fixed in advance. In Sect. 5, we will discuss the problem that which value of \(\beta\) is the best for certain data set.

  4. http://www.orgnet.com/cases.html.

  5. http://linqs.cs.umd.edu/projects//projects/lbc/index.html.

  6. http://www-personal.umich.edu/~mejn/netdata.

  7. http://dblp.org.

References

  1. Wen LL, Zhou KL, Yang SL (2019) A shape-based clustering method for pattern recognition of residential electricity consumption. J Clean Prod 212:475–488

    Article  Google Scholar 

  2. Li C, Kulwa F, Zhang JH et al (2021) A review of clustering methods in microorganism image analysis. Inf Technol Biomed 1186:13–25

    Google Scholar 

  3. Kookueva VV, Tsertseil JS (2018) Clustering as a basis for an innovative development strategy. Eur Res Stud J 21:818–830

    Article  Google Scholar 

  4. Pocol CB, Marinescu V, Dabija DC et al (2021) Clustering generation Z university students based on daily fruit and vegetable consumption: empirical research in an emerging market. Br Food J 123:2706–2727

    Article  Google Scholar 

  5. Allahyari M, Pouriyeh S, Assefi M et al (2017) A brief survey of text mining: classification, clustering and extraction techniques. arXiv:1707.02919

  6. Janani R, Vijayarani S (2019) Text document clustering using spectral clustering algorithm with particle swarm optimization. Expert Syst Appl 134:192–200

    Article  Google Scholar 

  7. Steinbach M, Ert\(\ddot{o}\)z L, Kumar V (2004) The challenges of clustering high dimensional data. In: New directions in statistical physics. Springer, Berlin, pp 273–309

  8. Afzali M, Kumar S (2019) Text document clustering: issues and challenges. In: Proceedings of the 2019 international conference on machine learning, big data, cloud and parallel computing, pp 263–268

  9. Bothorel C, Cruz JD, Magnani M et al (2015) Clustering attributed graphs: models, measures and methods. Netw Sci 3:408–444

    Article  Google Scholar 

  10. Li ZT, Liu J, Wu K (2018) A multiobjective evolutionary algorithm based on structural and attribute similarities for community detection in attributed networks. IEEE Trans Cybern 48:1963–1976

    Article  Google Scholar 

  11. Zhou HF, Li J, Li JH et al (2017) A graph clustering method for community detection in complex networks. Phys A 469:551–562

    Article  Google Scholar 

  12. Qin XW, Han XX, Chu JW et al (2021) Density peaks clustering based on Jaccard similarity and label propagation. Cogn Comput 13:1609–1626

    Article  Google Scholar 

  13. Xu J (2011) Graph clustering algorithm based on the degree and the number of vertices. Dalian Maritime University, pp 2–3

  14. Xiong H, Wu JJ, Chen J (2009) K-means clustering versus validation measures: a data-distribution perspective. IEEE Trans Syst Man Cybern Part B (Cybernetics) 39:318–331

    Article  Google Scholar 

  15. Bezdek JC, Ehrlich R, Full W (1984) FCM: the fuzzy c-means clustering algorithm. Comput Geosci 10:191–203

    Article  Google Scholar 

  16. Zhou KL, Yang SL (2016) Exploring the uniform effect of FCM clustering: a data distribution perspective. Knowl Based Syst 96:76–83

    Article  Google Scholar 

  17. Wang C, Pan SR, Hu RQ et al (2019) Attributed graph clustering: a deep attentional embedding approach. arXiv:1906.06532

  18. Fan SH, Wang X, Shi C et al (2020) One2multi graph autoencoder for multi-view graph clustering. In: Proceedings of the web conference, pp 3070–3076

  19. Li XL, Zhang HW, Zhang R et al (2021) Adaptive graph auto-encoder for general data clustering. IEEE Trans Pattern Anal Mach Intell 1–9

  20. Liao RJ, Brockschmidt M, Tarlow D et al (2018) Graph partition neural networks for semi-supervised classification. arXiv:1803.06272

  21. Zhang XT, Liu H, Li QM et al (2019) Attributed graph clustering via adaptive graph convolution. arXiv:1906.01210

  22. Lee C, Wilkinson DJ (2019) A review of stochastic block models and extensions for graph clustering. Appl Netw Sci 4:1–50

    Article  Google Scholar 

  23. Zhou Y, Cheng H, Yu JX (2009) Graph clustering based on structural/attribute similarities. Proc VLDB Endow 2:718–729

    Article  Google Scholar 

  24. Kulis B, Basu S, Dhillon I et al (2009) Semi-supervised graph clustering: a kernel approach. Mach Learn 74:1–22

    Article  Google Scholar 

  25. Li XC, Yin HZ, Zhou K et al (2020) Semi-supervised clustering with deep metric learning and graph embedding. World Wide Web 23:781–798

    Article  Google Scholar 

  26. Zhang Y, Wu B, Liu Y (2017) A novel community detection method based on rough set k-means. J Electron Inf Technol 39:770–777

    Google Scholar 

  27. Gupta S, Kumar P (2020) An overlapping community detection algorithm based on rough clustering of links. Data Knowl Eng 125:101777

    Article  Google Scholar 

  28. Cai YD, Huang JZ, Yin JF (2022) A new method to build the adaptive k-nearest neighbors similarity graph matrix for spectral clustering. Neurocomputing 493:191–203

    Article  Google Scholar 

  29. Callum S, Wang JB, Li YY (2020) Quantum walk inspired algorithm for graph similarity and isomorphism. Quantum Inf Process 19:280–299

    MathSciNet  Google Scholar 

  30. Li JY, Jiang W, Han H et al (2021) ScGSLC: an unsupervised graph similarity learning framework for single-cell RNA-seq data clustering. Comput Biol Chem 90:107415

    Article  Google Scholar 

  31. Fouss F, Pirotte A, Renders JM et al (2007) Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation. IEEE Trans Knowl Data Eng 19:355–369

    Article  Google Scholar 

  32. Cai HY, Zheng VW, Chang KC (2017) A comprehensive survey of graph embedding: problems, techniques, and applications. IEEE Trans Knowl Data Eng 30:1616–1637

    Article  Google Scholar 

  33. Jinarat S, Manaskasemsak B, Rungsawang A (2019) Short text clustering based on word semantic graph with word embedding model. In: Proceedings of the 10th international conference on soft computing and intelligent systems (SCIS), pp 1427–1432

  34. Pawlak Z (1982) Rough sets. Int J Comput Inform Sci 11:341–356

    Article  Google Scholar 

  35. Hashemzadeh M, Oskouei AG, Farajzadeh N (2019) New fuzzy c-means clustering method based on feature-weight and cluster-weight learning. Appl Soft Comput 78:324–345

    Article  Google Scholar 

  36. Knuth DE (1993) The Stanford graphbase: a platform for combinatorial computing. Addison-Wesley, Reading

    MATH  Google Scholar 

  37. Newman MEJ (2006) Finding community structure in networks using the eigenvectors of matrices. Phys Rev E 74:1–22

    Article  MathSciNet  Google Scholar 

  38. Girvan M, Newman MEJ (2002) Community structure in social and biological networks. Proc Natl Acad Sci 7821–7826

  39. White JG, Southgate E, Thomson JN et al (1986) The structure of the nervous system of the nematode Caenorhabditis elegans. Philos Trans R Soc Lond 314:1–340

    Google Scholar 

  40. Prithviraj S, Galileo N, Mustafa B et al (2008) Collective classification of network data. AI Mag 29:93–106

    Google Scholar 

  41. Zhou WP, Lu L (2010) Link prediction based on local random walk. EPL (Europhysics Letters) 89:58007

  42. Mitra S, Banka H, Pedrycz W (2006) Rough-fuzzy collaborative clustering. IEEE Trans Syst Man Cybern Part B (Cybernetics) 36:795–805

    Article  Google Scholar 

  43. Ding Y, Fu X (2016) Kernel-based fuzzy c-means clustering algorithm based on genetic algorithm. Neurocomputing 188:233–238

    Article  Google Scholar 

  44. Zhan H, Chen P, Zhang XF (2020) Overlapping community partition based on rough fuzzy clustering algorithm. Inf Syst Eng 3:89–94

    Google Scholar 

Download references

Acknowledgements

We are hugely grateful to the possible anonymous reviewers for their constructive comments with respect to the original manuscript. What is more, we thank the National Natural Science Foundation of China (Nos. 61966039, 11971065) and the Yunnan Province Education Department Scientific Research Fund Project (No. 2021Y670).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Shihu Liu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

He, W., Liu, S., Xu, W. et al. On rough set based fuzzy clustering for graph data. Int. J. Mach. Learn. & Cyber. 13, 3463–3490 (2022). https://doi.org/10.1007/s13042-022-01607-6

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s13042-022-01607-6

Keywords

Navigation