Abstract
Heterogeneous information networks (HINs), which contain different types of entities connected by various kinds of relations, provide richer information than homogeneous networks. Heterogeneous motifs are induced subgraph patterns that carry semantics in HINs. Many works have used motifs in the representation learning of HINs, but few have examined the respective influence of individual motifs. Because heterogeneous motifs carry rich semantic information, different motif structures contribute unequally to network representation. In this paper, we present a case study on the AMiner dataset in which we extract heterogeneous motifs with various types of nodes and edges, especially four-node motifs, and explore the relations between those motifs. We first organize the set of motif instances identified by a subgraph isomorphism algorithm into a weighted bipartite graph, and then use another semantically related node type to extract target node features from the pruned adjacency matrix. Next, we design a series of experiments to evaluate the effect of each motif and the irrelevance of different motifs. Experimental results show that the embeddings produced by our framework achieve excellent results compared with several state-of-the-art alternatives in node classification and clustering tasks.
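As a rough illustration of the bipartite construction described in the abstract, the following minimal sketch turns a list of motif instances into a weighted target-by-context matrix and prunes low-weight entries. It is not the authors' implementation: the function name `motif_bipartite_features`, the node types `author`, `paper` and `venue`, the co-occurrence counting rule, and the `min_weight` threshold are all assumptions made for illustration, and the motif instances are taken as given rather than found by a subgraph isomorphism algorithm.

```python
from collections import Counter


def motif_bipartite_features(instances, target_type, context_type, min_weight=1):
    """Build a pruned, weighted bipartite matrix from motif instances.

    instances: iterable of motif instances, each a list of (node_id, node_type).
    The weight of (target, context) is the number of instances containing both
    nodes (an assumed weighting rule, not taken from the paper).
    """
    weights = Counter()
    for instance in instances:
        targets = {n for n, t in instance if t == target_type}
        contexts = {n for n, t in instance if t == context_type}
        for u in targets:
            for v in contexts:
                weights[(u, v)] += 1

    rows = sorted({u for u, _ in weights})
    cols = sorted({v for _, v in weights})
    col_index = {v: j for j, v in enumerate(cols)}

    matrix = [[0] * len(cols) for _ in rows]
    for i, u in enumerate(rows):
        for v in cols:
            w = weights.get((u, v), 0)
            # Pruning step (assumed): drop entries below the threshold.
            matrix[i][col_index[v]] = w if w >= min_weight else 0
    return rows, cols, matrix


# Hypothetical four-node motif instances: two authors, one paper, one venue.
toy_instances = [
    [("a1", "author"), ("a2", "author"), ("p1", "paper"), ("v1", "venue")],
    [("a1", "author"), ("a3", "author"), ("p2", "paper"), ("v1", "venue")],
    [("a2", "author"), ("a3", "author"), ("p3", "paper"), ("v2", "venue")],
]

rows, cols, features = motif_bipartite_features(toy_instances, "author", "venue")
_, _, pruned = motif_bipartite_features(toy_instances, "author", "venue", min_weight=2)
```

Each row of `features` can be read as a feature vector for one target node, indexed by the semantically related context type; raising `min_weight` keeps only the pairs that co-occur in several motif instances.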














Availability of data and material
All materials, including figures, are owned by the authors, and no permissions are required. The initial dataset is the public dataset available at https://www.aminer.cn/aminernetwork.
Notes
The main symbols used in our framework are given in Table 4.
Acknowledgements
The authors would like to acknowledge the support provided by the National Natural Science Foundation of China under Grant 61872222, the Key Research and Development Program of Shandong Province (2020CXGC010102), and the project ZR2020LZH011 supported by Shandong Provincial Natural Science Foundation.
Funding
This work is supported by the National Natural Science Foundation of China under Grant 61872222, the Key Research and Development Program of Shandong Province (2020CXGC010102), and the project ZR2020LZH011 supported by Shandong Provincial Natural Science Foundation.
Author information
Contributions
Siyuan Ye wrote the main manuscript text and prepared all figures and tables. Siyuan Ye, Guangxu Mei and Shijun Liu provided the Methodology. Qian Li, Shijun Liu and Li Pan provided writing-review and editing. Shijun Liu and Li Pan provided funding support.
Ethics declarations
Ethical Approval
This declaration is not applicable.
Conflicts of interest
I declare that all authors have no competing interests as defined by Springer, or other interests that might be perceived to influence the results and discussion reported in this paper.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ye, S., Li, Q., Mei, G. et al. How the four-nodes motifs work in heterogeneous node representation?. World Wide Web 26, 1707–1729 (2023). https://doi.org/10.1007/s11280-022-01115-1
DOI: https://doi.org/10.1007/s11280-022-01115-1