On the Utilization of Structural and Textual Information of a Scientific Knowledge Graph to Discover Future Research Collaborations: A Link Prediction Perspective

Giarelis, Nikolaos; Kanakaris, Nikos; Karacapilidis, Nikos

doi:10.1007/978-3-030-61527-7_29

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12323)

Included in the following conference series:

International Conference on Discovery Science

Abstract

We consider the discovery of future research collaborations as a link prediction problem applied to scientific knowledge graphs. Our approach integrates both structured and unstructured textual data into a single knowledge graph through a novel representation of multiple scientific documents. The proposed scientific knowledge graph is stored in the Neo4j graph database. We implement our approach in Python with the scikit-learn ML library. We benchmark it against classical link prediction algorithms, using accuracy, recall, and precision as performance metrics. Our initial experiments demonstrate a significant improvement in the accuracy of the future collaboration prediction task. The experiments reported in this paper use the new COVID-19 Open Research Dataset.
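
To make the benchmark setup in the abstract concrete, the following is a minimal sketch of classical link prediction scores (common neighbors, Jaccard, Adamic-Adar) and of the three reported metrics. It is not the authors' implementation: the paper uses Neo4j and scikit-learn on CORD-19, whereas this sketch uses only the Python standard library, and the toy co-authorship graph, the threshold, the "future" links, and all function names are hypothetical.

```python
import math
from itertools import combinations

# Hypothetical toy co-authorship graph: author -> set of current co-authors.
GRAPH = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}


def common_neighbors(g, u, v):
    """Number of co-authors shared by u and v."""
    return len(g[u] & g[v])


def jaccard(g, u, v):
    """Shared co-authors divided by all co-authors of u or v."""
    union = g[u] | g[v]
    return len(g[u] & g[v]) / len(union) if union else 0.0


def adamic_adar(g, u, v):
    """Shared co-authors, each weighted by 1 / log(degree)."""
    return sum(
        1.0 / math.log(len(g[w]))
        for w in g[u] & g[v]
        if len(g[w]) > 1  # log(1) == 0, so skip degree-1 nodes
    )


def candidate_pairs(g):
    """All author pairs that are not yet connected."""
    return [(u, v) for u, v in combinations(sorted(g), 2) if v not in g[u]]


def predict_links(g, score, threshold):
    """Predict a future link when the score reaches the threshold."""
    return {(u, v) for u, v in candidate_pairs(g) if score(g, u, v) >= threshold}


def evaluate(predicted, actual, candidates):
    """Return (accuracy, precision, recall) over the candidate pairs."""
    tp = len(predicted & actual)
    fp = len(predicted - actual)
    fn = len(actual - predicted)
    tn = len(candidates) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(candidates) if candidates else 0.0
    return accuracy, precision, recall


# Hypothetical collaborations that appear later (assumed ground truth).
FUTURE_LINKS = {("a", "d"), ("b", "e")}

predicted = predict_links(GRAPH, common_neighbors, threshold=2)
accuracy, precision, recall = evaluate(predicted, FUTURE_LINKS, candidate_pairs(GRAPH))
print(predicted, accuracy, precision, recall)
```

In a supervised variant, scores like these would serve as features of a classifier rather than being thresholded directly; the sketch keeps to thresholding so that it stays dependency-free.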

Acknowledgments

The work presented in this paper is supported by the OpenBio-C project (www.openbio.eu), which is co-financed by the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call RESEARCH – CREATE – INNOVATE (Project id: T1EDK-05275). The authors would also like to thank Stamatis Karlos for his assistance with the statistical analysis of the data.

Author information

Authors and Affiliations

Industrial Management and Information Systems Lab, MEAD, University of Patras, 26504, Rio Patras, Greece
Nikolaos Giarelis, Nikos Kanakaris & Nikos Karacapilidis

Authors

Nikolaos Giarelis
Nikos Kanakaris
Nikos Karacapilidis

Corresponding author

Correspondence to Nikos Karacapilidis.

Editor information

Editors and Affiliations

University of Bari Aldo Moro, Bari, Italy
Annalisa Appice
Aristotle University of Thessaloniki, Thessaloniki, Greece
Grigorios Tsoumakas
Open University of Cyprus, Nicosia, Cyprus
Yannis Manolopoulos
Dalhousie University, Halifax, NS, Canada
Stan Matwin

About this paper

Cite this paper

Giarelis, N., Kanakaris, N., Karacapilidis, N. (2020). On the Utilization of Structural and Textual Information of a Scientific Knowledge Graph to Discover Future Research Collaborations: A Link Prediction Perspective. In: Appice, A., Tsoumakas, G., Manolopoulos, Y., Matwin, S. (eds) Discovery Science. DS 2020. Lecture Notes in Computer Science, vol 12323. Springer, Cham. https://doi.org/10.1007/978-3-030-61527-7_29

DOI: https://doi.org/10.1007/978-3-030-61527-7_29
Published: 15 October 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-61526-0
Online ISBN: 978-3-030-61527-7
eBook Packages: Computer Science, Computer Science (R0)
