Abstract
Network representation learning (NRL) maps vertices into a latent vector space for downstream network inference. Existing algorithms focus mainly on whether the vectors of two similar nodes are close in the latent space, and largely neglect hierarchical proximity. The distribution of the representation vectors should, however, reflect the hierarchical structural properties that are widespread in networks. In this paper, we propose a novel network representation learning framework that encodes interpretable hierarchical structural semantics into the representation vectors. Specifically, we measure the distance and importance degree of nodes in the original network and map the nodes to a tree space. The resulting tree clearly reveals the hierarchical structural relations of the original network and is itself readily interpretable. The local structural proximities and the interpretable hierarchy knowledge are then encoded into the vector space by optimizing an objective function. Extensive experiments on real-world data sets demonstrate that the proposed approach outperforms existing state-of-the-art approaches on node classification, link prediction, and visualization. Finally, a case study analyzes how the proposed model works.
Acknowledgements
The authors would like to thank the editors and anonymous reviewers for their constructive comments. This work is supported in part by the National Science Foundation of China (grant nos. 61936001, 61772096, 61966005), the Graduate Research and Innovation Project Plan of Chongqing Municipal Education Commission (grant no. CYB18174), and the Doctor Training Program of Chongqing University of Posts and Telecommunications (grant no. BYJS201809).
Cite this article
Fu, S., Wang, G. & Xu, J. hier2vec: interpretable multi-granular representation learning for hierarchy in social networks. Int. J. Mach. Learn. & Cyber. 12, 2543–2557 (2021). https://doi.org/10.1007/s13042-021-01338-0
DOI: https://doi.org/10.1007/s13042-021-01338-0