Abstract
In GitHub's open-source collaborative development scenario, the entity types and the link relationships between them are naturally heterogeneous. Improving the accuracy of project recommendation requires integrating this multi-source information effectively. For the project recommendation scenario, this paper therefore defines an open source weighted heterogeneous information network that represents the different entity types and link relationships in GitHub collaborative development and models the complex interactions among developers, projects, and other entities. A weighted heterogeneous information network embedding method extracts the rich structural and semantic information in this network to learn node representations of developers and projects, and a personalized nonlinear fusion function is incorporated into the matrix factorization model for open source project recommendation. Extensive comparative experiments on a real GitHub open data set compare the proposed model with other project recommendation methods to verify its effectiveness, and also explore how different metapaths affect recommendation performance. The experimental results show that the recommendation method based on the heterogeneous information network can effectively improve recommendation quality.
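As a rough illustration of the embedding step described above, the following sketch shows one common way to sample node sequences from a weighted heterogeneous network: a random walk that follows a metapath and picks each neighbor with probability proportional to its edge weight. It is not the chapter's implementation. The toy graph, the edge weights, the developer-project-developer metapath, and the names EDGES, node_type, and metapath_walk are all assumptions made for this example.

```python
import random

# Toy weighted heterogeneous network. All nodes and weights are hypothetical.
# The node type is encoded by the first letter: "d" = developer, "p" = project.
# An edge weight stands in for interaction strength between the two nodes.
EDGES = {
    "d1": {"p1": 3.0, "p2": 1.0},
    "d2": {"p1": 2.0, "p3": 1.0},
    "d3": {"p2": 1.0, "p3": 2.0},
    "p1": {"d1": 3.0, "d2": 2.0},
    "p2": {"d1": 1.0, "d3": 1.0},
    "p3": {"d2": 1.0, "d3": 2.0},
}


def node_type(node):
    """Return the one-letter type tag of a node."""
    return node[0]


def metapath_walk(start, metapath, length, rng):
    """Sample a walk of up to `length` nodes that follows `metapath`.

    `metapath` is a symmetric type pattern such as "dpd"; it is applied
    cyclically. Each step picks a neighbor of the required type with
    probability proportional to the edge weight.
    """
    if node_type(start) != metapath[0]:
        raise ValueError("start node does not match the metapath")
    pattern = metapath[:-1]  # "dpd" repeats as d, p, d, p, ...
    walk = [start]
    while len(walk) < length:
        wanted = pattern[len(walk) % len(pattern)]
        candidates = [
            (nbr, w) for nbr, w in EDGES[walk[-1]].items()
            if node_type(nbr) == wanted
        ]
        if not candidates:
            break  # dead end: no neighbor of the required type
        nodes, weights = zip(*candidates)
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(metapath_walk("d1", "dpd", 7, rng))
```

Sequences sampled this way could then be passed to a skip-gram style model to obtain developer and project vectors, and those vectors could feed the recommendation model. Both of those steps are omitted here.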
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Lin, H., Liang, G., Wu, Y., Wu, B., Tian, C., Wang, W. (2023). Open Source Software Supply Chain Recommendation Based on Heterogeneous Information Network. In: Gainaru, A., Zhang, C., Luo, C. (eds) Benchmarking, Measuring, and Optimizing. Bench 2022. Lecture Notes in Computer Science, vol 13852. Springer, Cham. https://doi.org/10.1007/978-3-031-31180-2_5
DOI: https://doi.org/10.1007/978-3-031-31180-2_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-31179-6
Online ISBN: 978-3-031-31180-2
eBook Packages: Computer Science, Computer Science (R0)