Is the simple assignment enough? Exploring the interpretability for community detection

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Maximum likelihood estimation is a probabilistic inference approach for modeling community connectivity in large networks. In general, only the adjacency matrix is used to infer the community structure parameters. Although recent work combines connectivity and attribute information for community detection, we propose an enhanced overlapping community detection model that combines adjacency spectral embedding with maximum likelihood estimation. This gives the model the flexibility to enrich the connectivity information of complex networks with measurements derived from attribute embedding. The attribute embedding captures and transforms the attribute information so that it can be encoded jointly with the structure information. The link strength among communities is then designed to adjust the impact of this structural information on community generation according to the contribution of the structure to the clusters, and the node assignment allows for the nature of real networks (overlapping nodes and outliers). Experiments on attributed networks show that the model achieves satisfactory performance on the attributed community detection task.
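
As a rough illustration of the adjacency spectral embedding step mentioned in the abstract, the sketch below embeds each node using the leading eigenpairs of a symmetric adjacency matrix, with the eigenvectors scaled by the square roots of the eigenvalue magnitudes. This is the generic textbook form of the embedding, not the model proposed in the article; the function names, the toy two-clique graph, and the choice of two dimensions are all assumptions made for illustration.

```python
import numpy as np


def adjacency_spectral_embedding(adjacency, dim):
    """Return an (n, dim) embedding built from the `dim` eigenpairs of largest magnitude.

    Hypothetical helper for illustration; assumes an undirected graph
    given as a symmetric adjacency matrix.
    """
    a = np.asarray(adjacency, dtype=float)
    if a.ndim != 2 or a.shape[0] != a.shape[1] or not np.allclose(a, a.T):
        raise ValueError("expected a square, symmetric adjacency matrix")
    eigvals, eigvecs = np.linalg.eigh(a)              # eigenvalues in ascending order
    top = np.argsort(np.abs(eigvals))[::-1][:dim]     # indices of largest |eigenvalue|
    # Scale each selected eigenvector by sqrt(|eigenvalue|).
    return eigvecs[:, top] * np.sqrt(np.abs(eigvals[top]))


def two_clique_graph():
    """Toy graph (assumed example): two 4-node cliques joined by the edge 3-4."""
    a = np.zeros((8, 8))
    a[:4, :4] = 1.0
    a[4:, 4:] = 1.0
    np.fill_diagonal(a, 0.0)
    a[3, 4] = a[4, 3] = 1.0
    return a


embedding = adjacency_spectral_embedding(two_clique_graph(), dim=2)
# On this toy graph the second coordinate has one sign on nodes 0-3
# and the opposite sign on nodes 4-7, so it separates the two cliques.
print(np.round(embedding, 3))
```

In a pipeline like the one the abstract outlines, such node coordinates would be one input to a later likelihood-based assignment step; how the article combines them with attribute embedding is not shown here.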

Notes

  1. Throughout the text, we use the words {{nodes, vertices, objects}, {characteristic, intuition, nature, property}, {circles, clusters, communities}} interchangeably.

  2. http://snap.stanford.edu/data

  3. http://snap.stanford.edu/data/com-Youtube.html

Acknowledgements

This work is supported by the National Natural Science Foundation of China (61762078, 61363058, 61966004), the Research Fund of Guangxi Key Lab of Multi-source Information Mining and Security (MIMS18-08), the Northwest Normal University Young Teachers Research Capacity Promotion Plan (NWNU-LKQN2019-2), the Research Fund of Guangxi Key Laboratory of Trusted Software (kx202003), and the Gansu Innovation College Fund Project (2020B-089).

Author information

Corresponding author

Correspondence to Huifang Ma.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhao, Q., Ma, H., Li, X. et al. Is the simple assignment enough? Exploring the interpretability for community detection. Int. J. Mach. Learn. & Cyber. 12, 3463–3474 (2021). https://doi.org/10.1007/s13042-021-01384-8

  • DOI: https://doi.org/10.1007/s13042-021-01384-8
