hier2vec: interpretable multi-granular representation learning for hierarchy in social networks

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Network representation learning (NRL) maps vertices into a latent vector space for downstream network inference. Existing algorithms focus mainly on whether the vectors of two similar nodes are close in the latent space, and largely neglect hierarchical proximity. Yet the distribution of the representation vectors should also reflect the hierarchical structural properties that are widespread in networks. In this paper, we propose a novel network representation learning framework that encodes interpretable hierarchical structural semantics into the representation vectors. Specifically, we measure the distance between nodes and the importance degree of each node in the original network, and map the nodes to a tree space. The resulting tree clearly reveals the hierarchical structural relations of the original network and is itself easy to interpret. The local structural proximities and the interpretable hierarchy knowledge are then encoded into the vector space by optimizing an objective function. Extensive experiments on real-world data sets demonstrate that the proposed approach outperforms existing state-of-the-art approaches on node classification, link prediction, and visualization. Finally, a case study further analyzes how the proposed model works.
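
The tree-mapping step described in the abstract can be pictured with a minimal sketch. The abstract does not state which distance or importance measures are used, so this example assumes, purely for illustration, that importance is node degree and distance is hop count, and it attaches each node to its nearest more important node. The function names `hop_distances` and `build_tree` and the toy graph are hypothetical and are not taken from the paper.

```python
from collections import deque


def hop_distances(adj, source):
    """Breadth-first search: hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def build_tree(adj):
    """Return {node: parent}; a node with no more important node gets None.

    Assumption (not from the paper): importance is degree, with ties
    broken by the smaller node id so that the ordering is strict.
    """
    order = sorted(adj, key=lambda u: (-len(adj[u]), u))
    pos = {u: i for i, u in enumerate(order)}  # smaller = more important
    parent = {}
    for u in adj:
        dist = hop_distances(adj, u)
        higher = [v for v in dist if pos[v] < pos[u]]
        # Nearest more important node; distance ties go to the more important one.
        parent[u] = min(higher, key=lambda v: (dist[v], pos[v])) if higher else None
    return parent


# Toy undirected graph as an adjacency list (illustrative data only).
toy = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0, 4], 4: [3]}
tree = build_tree(toy)
print(tree)
```

In this toy graph the highest-degree node becomes the root, and the depth of a node in the resulting tree gives one simple, readable notion of its hierarchy level. The embedding step itself (optimizing the objective function) is not sketched here because the abstract does not specify the objective.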

Acknowledgements

The authors would like to thank the editors and anonymous reviewers for their constructive comments. This work is supported in part by the National Science Foundation of China (grant nos. 61936001, 61772096, 61966005), the Graduate Research and Innovation Project Plan of Chongqing Municipal Education Commission (grant no. CYB18174), and the Doctor Training Program of Chongqing University of Posts and Telecommunications (grant no. BYJS201809).

Author information

Corresponding author

Correspondence to Guoyin Wang.

About this article

Cite this article

Fu, S., Wang, G. & Xu, J. hier2vec: interpretable multi-granular representation learning for hierarchy in social networks. Int. J. Mach. Learn. & Cyber. 12, 2543–2557 (2021). https://doi.org/10.1007/s13042-021-01338-0

  • DOI: https://doi.org/10.1007/s13042-021-01338-0
