Finding Communities by Decomposing and Embedding Heterogeneous Information Network

  • Regular Paper
Journal of Computer Science and Technology

Abstract

Community discovery is an important task in social network analysis. However, most existing community discovery methods rely on topological structure alone and ignore the rich information available in content data. To address this issue, we present a community discovery method based on heterogeneous information network decomposition and embedding. Unlike traditional methods, our method takes into account topology, node content, and edge content, which together supply abundant evidence for community discovery. First, we propose an embedding-based similarity evaluation method, which decomposes the heterogeneous information network into several subnetworks and extracts their latent deep representations to evaluate the similarities between nodes. Second, we propose a bottom-up community discovery algorithm that finds communities more efficiently through leader node selection, initial community generation, and community expansion. Third, we propose incremental maintenance strategies for handling changes in the network. We conduct experimental studies on three real-world social networks. The experiments demonstrate the effectiveness and efficiency of the proposed method: compared with traditional methods, it improves normalized mutual information (NMI) and modularity by an average of 12% and 37%, respectively.
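
The two evaluation metrics named in the abstract, modularity and NMI, can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names `modularity` and `nmi` are hypothetical, the graph is assumed to be undirected and unweighted, communities are assumed to be non-overlapping, and NMI is normalized by the arithmetic mean of the two entropies, which may differ from the normalization used in the article's experiments.

```python
import math
from collections import Counter


def modularity(edges, labels):
    """Newman modularity Q of a hard partition (illustrative helper).

    edges  : list of (u, v) pairs, each undirected edge listed once
    labels : dict mapping node -> community id
    """
    m = len(edges)
    if m == 0:
        return 0.0
    degree = Counter()    # degree of each node
    internal = Counter()  # number of edges inside each community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if labels[u] == labels[v]:
            internal[labels[u]] += 1
    degree_sum = Counter()  # total degree of each community
    for node, k in degree.items():
        degree_sum[labels[node]] += k
    # Q = sum over communities c of  l_c / m - (d_c / (2m))^2
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


def nmi(labels_a, labels_b):
    """Normalized mutual information between two equal-length labelings.

    Assumption: normalization by the arithmetic mean of the entropies.
    """
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mutual_info = sum(
        (n_ab / n) * math.log(n_ab * n / (count_a[a] * count_b[b]))
        for (a, b), n_ab in joint.items()
    )
    entropy_a = -sum((c / n) * math.log(c / n) for c in count_a.values())
    entropy_b = -sum((c / n) * math.log(c / n) for c in count_b.values())
    if entropy_a + entropy_b == 0:
        return 1.0  # both labelings put every node in a single community
    return 2 * mutual_info / (entropy_a + entropy_b)


# Toy graph: two triangles joined by one bridge edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
toy_labels = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
toy_q = modularity(toy_edges, toy_labels)
toy_nmi = nmi([0, 0, 0, 1, 1, 1], [1, 1, 1, 0, 0, 0])
```

On the toy graph, splitting the two triangles gives Q = 5/14, and the two labelings passed to `nmi` describe the same partition under different community ids, so their NMI is 1. Both metrics depend only on the final node-to-community assignment, so they apply to any discovery method regardless of how similarities were computed.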

Acknowledgements

We are grateful to Prof. Quinn Snell of Brigham Young University, USA, for his encouragement. We also thank Guang-Bin Liu, Qiang Liao and Hao Dong of Northeastern University, China, for their efforts on this work.

Author information

Corresponding author

Correspondence to Dong Li.

Electronic supplementary material

ESM 1 (PDF 111 kb)

About this article

Cite this article

Kou, Y., Shen, DR., Li, D. et al. Finding Communities by Decomposing and Embedding Heterogeneous Information Network. J. Comput. Sci. Technol. 35, 320–337 (2020). https://doi.org/10.1007/s11390-020-9957-8

  • DOI: https://doi.org/10.1007/s11390-020-9957-8
