Attributed network community detection based on network embedding and parameter-free clustering

Applied Intelligence

Abstract

In recent years, many attributed networks have emerged, such as Facebook social networks, protein networks, and academic citation networks. To find, by unsupervised learning, communities whose nodes are tightly connected and have similar attributes, and to improve the accuracy of community detection for better analysis of attributed networks, we propose a two-stage attributed network community detection method that combines network embedding with parameter-free clustering. In the first stage, we build an attributed network embedding framework that integrates common neighbor information and node attributes. We define node similarity in terms of local link information, jointly model it with attribute proximity, and then adopt a distributed algorithm to obtain the embedding vector of each node. In the second stage, the number of communities is determined automatically based on curvature and modularity, and the community detection results are obtained by clustering the embeddings. We compare our method with several representative approaches on real network datasets. The experimental results validate the effectiveness and superiority of our approach.
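The two ingredients named in the abstract, a similarity that mixes local link information with attribute proximity and a modularity score for judging a partition, can be illustrated with a minimal sketch. It is not the method of the paper: the closed-neighborhood Jaccard overlap, the cosine attribute similarity, the mixing weight `alpha`, and all function names are assumptions made for illustration, and the curvature criterion is omitted. Only the modularity formula follows the standard definition of Newman.

```python
import math

def common_neighbor_similarity(adj, u, v):
    # Assumed choice: Jaccard overlap of closed neighborhoods (node plus neighbors).
    nu, nv = adj[u] | {u}, adj[v] | {v}
    return len(nu & nv) / len(nu | nv)

def cosine_similarity(x, y):
    # Cosine similarity of two attribute vectors; 0.0 if either vector is all zeros.
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y) if norm_x and norm_y else 0.0

def joint_similarity(adj, attrs, u, v, alpha=0.5):
    # Hypothetical weighted mix of structural and attribute proximity.
    structural = common_neighbor_similarity(adj, u, v)
    attribute = cosine_similarity(attrs[u], attrs[v])
    return alpha * structural + (1 - alpha) * attribute

def modularity(adj, communities):
    # Newman modularity Q of a partition of an undirected, unweighted graph.
    m = sum(len(neighbors) for neighbors in adj.values()) / 2
    q = 0.0
    for community in communities:
        internal = sum(1 for u in community for v in adj[u] if v in community) / 2
        degree = sum(len(adj[u]) for u in community)
        q += internal / m - (degree / (2 * m)) ** 2
    return q

# Toy graph: two triangles joined by the single edge 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
attrs = {0: [1, 0], 1: [1, 0], 2: [1, 0], 3: [0, 1], 4: [0, 1], 5: [0, 1]}

# Candidate partitions with one and two communities; keep the one with highest Q.
candidates = [[{0, 1, 2, 3, 4, 5}], [{0, 1, 2}, {3, 4, 5}]]
best = max(candidates, key=lambda partition: modularity(adj, partition))

print(joint_similarity(adj, attrs, 0, 1), joint_similarity(adj, attrs, 2, 3))
print(len(best), modularity(adj, best))
```

On this toy graph the two-triangle partition has higher modularity than the single-community partition, so a modularity-only selection picks two communities. Using modularity alone is a simplification of the curvature-and-modularity rule described in the abstract.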

Notes

  1. http://dmml.asu.edu/users/xufei/datasets.html

  2. https://linqs.soe.ucsc.edu/data

  3. http://dmml.asu.edu/users/xufei/datasets.html

  4. https://linqs.soe.ucsc.edu/data

  5. https://github.com/DreamerDW/FeatWalk_AAAI19

  6. https://github.com/authorxxl/CDANE_matlab

  7. https://github.com/phanein/deepwalk

  8. https://github.com/tangjianpku/LINE

  9. http://www.public.asu.edu/~jundongl/code/NetFS.zip

  10. https://github.com/xhuang31/AANE_MATLAB

  11. https://github.com/cshaowang/gmc

  12. https://www.mathworks.com/matlabcentral/fileexchange/10161-mean-shift-clustering

  13. https://www.mathworks.com/matlabcentral/fileexchange/52905-dbscan-clustering-algorithm

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (Nos. 61773348, 61873240, and 61603340) and in part by the Public Welfare Technology Research Project of Zhejiang Province, China (No. LGG20F020017).

Author information

Corresponding author

Correspondence to Xu-Hua Yang.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xu, XL., Xiao, YY., Yang, XH. et al. Attributed network community detection based on network embedding and parameter-free clustering. Appl Intell 52, 8073–8086 (2022). https://doi.org/10.1007/s10489-021-02779-4

  • DOI: https://doi.org/10.1007/s10489-021-02779-4
