
Semi-supervised community detection on attributed networks using non-negative matrix tri-factorization with node popularity

  • Research Article
  • Frontiers of Computer Science

Abstract

The World Wide Web generates ever more data with links and node contents, which are typically modeled as attributed networks. Identifying network communities plays an important role in helping people understand and utilize the semantic functions of the data. A few methods based on non-negative matrix factorization (NMF) have been proposed to detect community structure with semantic information in attributed networks. However, previous methods have not modeled three key factors that jointly affect the link generating process: prior information, the heterogeneity of node degree, and the interactions among communities. These three factors have been demonstrated to primarily affect the results. In this paper, we propose a semi-supervised community detection method on attributed networks that considers these three factors simultaneously. First, a semi-supervised non-negative matrix tri-factorization model with node popularity (PSSNMTF) is designed to detect communities on the topology of the network. Then, node contents are integrated into the PSSNMTF model to find the semantic communities more accurately; the resulting model is called PSSNMTFC. The parameters of the PSSNMTFC model are estimated using the gradient descent method. Experiments on real and artificial networks illustrate that our new method is superior to related state-of-the-art methods in terms of accuracy.
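As a rough illustration of what a non-negative matrix tri-factorization fitted by gradient descent looks like, the sketch below approximates a symmetric adjacency matrix A by H S Hᵀ, where H holds non-negative node-to-community memberships and S holds non-negative interactions among communities. This is a generic textbook-style formulation and not the PSSNMTF or PSSNMTFC model, whose objective function is not given in this abstract: prior information, node popularity, and node contents are all omitted. The function names, the squared-error loss, the projected update, the step size, and the toy graph are assumptions made for illustration.

```python
import numpy as np


def nmtf_projected_gd(A, k, steps=2000, lr=0.01, seed=0):
    """Fit A ~ H @ S @ H.T with H >= 0 and S >= 0 by projected gradient descent.

    Generic illustration only (hypothetical helper): no prior information,
    node popularity, or node contents are modeled here.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    H = rng.random((n, k))          # node-to-community memberships
    S = rng.random((k, k))
    S = (S + S.T) / 2               # start from a symmetric interaction matrix
    for _ in range(steps):
        R = H @ S @ H.T - A         # residual of the current reconstruction
        # Gradients of ||A - H S H^T||_F^2 with respect to H and S.
        grad_H = 2 * (R @ H @ S.T + R.T @ H @ S)
        grad_S = 2 * (H.T @ R @ H)
        # Gradient step, then projection onto the non-negative orthant.
        H = np.maximum(H - lr * grad_H, 0.0)
        S = np.maximum(S - lr * grad_S, 0.0)
    return H, S


def two_clique_graph(size=4):
    """Toy adjacency matrix (assumed example): two cliques joined by one edge."""
    n = 2 * size
    A = np.zeros((n, n))
    A[:size, :size] = 1.0
    A[size:, size:] = 1.0
    np.fill_diagonal(A, 0.0)
    A[size - 1, size] = A[size, size - 1] = 1.0
    return A


if __name__ == "__main__":
    A = two_clique_graph()
    H, S = nmtf_projected_gd(A, k=2)
    # A hard community label per node is the column with the largest membership.
    print("hard community labels:", H.argmax(axis=1))
    print("reconstruction error:", np.linalg.norm(A - H @ S @ H.T) ** 2)
```

The middle factor S is what distinguishes tri-factorization from a plain two-factor NMF: its off-diagonal entries can absorb links between communities, which is the kind of inter-community interaction the abstract refers to.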



Acknowledgements

This work was partly supported by the National Natural Science Foundation of China (Grant Nos. 61876128, 61772361), the National Science Foundation of Hebei (F2019403070), and the Science and Technology Research Project for Universities of Hebei (ZD2020175).

Author information

Corresponding author

Correspondence to Bianfang Chai.

Additional information

Di Jin received his BS, MS, and PhD degrees in computer science from Jilin University, China in 2005, 2008, and 2012, respectively. He was a postdoctoral research fellow at the School of Design, Engineering, and Computer, Bournemouth University, U.K., from 2013 to 2014. He is currently an associate professor with the College of Intelligence and Computing, Tianjin University, China. He has published more than 50 papers in international journals and conferences in the areas of community detection, social network analysis, and machine learning.

Jing He received the BS degree from Guangxi University, China in 2016 and the MS degree in computer science and technology from Tianjin University, China in 2020. Her research interests are mainly related to community detection on social networks.

Bianfang Chai received the BE and ME degrees in computer science from Hebei University, China in 2002 and 2006, respectively, and the PhD degree from Beijing Jiaotong University, China in 2015. She is an associate professor with the School of Information Engineering, Hebei GEO University, China. Her current research interests include community detection, semi-supervised clustering, probabilistic graphical models, and complex network analysis.

Dongxiao He received her BS, MS, and PhD degrees in computer science from Jilin University, China in 2007, 2010, and 2014, respectively. She was a postdoctoral research fellow in the Department of Computer Science, Dresden University of Technology, Germany from 2014 to 2015. She is an associate professor with the College of Intelligence and Computing, Tianjin University, China. She has published over 40 international journal and conference papers. Her current research interests include data mining and analysis of complex networks.

Electronic supplementary material

11704_2020_9203_MOESM1_ESM.pdf

Semi-supervised community detection on attributed networks using non-negative matrix tri-factorization with node popularity

About this article

Cite this article

Jin, D., He, J., Chai, B. et al. Semi-supervised community detection on attributed networks using non-negative matrix tri-factorization with node popularity. Front. Comput. Sci. 15, 154324 (2021). https://doi.org/10.1007/s11704-020-9203-0

  • DOI: https://doi.org/10.1007/s11704-020-9203-0
