Abstract
This paper addresses the disambiguation of author names in scientific papers, focusing on two problems: synonyms, where two or more name forms denote the same person, and homonyms, where several authors share the same name. To address both problems, we propose a framework that combines an ontological model with a deep learning model. First, we describe the design of the ontology model, the automatic ontology creation process, and the construction of a weighted co-author network through a set of semantic rules and queries. Second, the selected features are preprocessed during the attribute engineering process to compute a similarity indicator for each feature. Third, the similarity indicators are represented in a vector space model and used as input to the deep learning-based author name disambiguation method, which models the different types of features. Fourth, the framework is tested on smaller subsets of LAGOS-AND, a large gold standard dataset of scientific papers drawn from several international databases, and achieves promising results compared to similar solutions proposed in the literature.
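As a rough illustration of the second and third steps, the sketch below counts co-author pairs to obtain edge weights and turns a pair of author records into a small vector of similarity indicators. It is a minimal sketch under stated assumptions, not the method of the paper: the record fields (`name`, `coauthors`, `title`), the function names, and the choice of string ratio and Jaccard overlap as indicators are all hypothetical stand-ins, and the sample names are invented.

```python
from collections import Counter
from difflib import SequenceMatcher
from itertools import combinations


def coauthor_weights(papers):
    """Count how often each unordered pair of author names shares a paper."""
    weights = Counter()
    for authors in papers:
        # Sort so that each pair is always stored in the same order.
        for a, b in combinations(sorted(set(authors)), 2):
            weights[(a, b)] += 1
    return weights


def name_similarity(a, b):
    """String similarity in [0, 1] between two name forms, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def jaccard(x, y):
    """Jaccard overlap in [0, 1] of two collections; 0.0 if both are empty."""
    x, y = set(x), set(y)
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def similarity_vector(rec_a, rec_b):
    """Map one pair of author records to a fixed-length indicator vector.

    The three indicators are assumptions chosen for illustration only.
    """
    return [
        name_similarity(rec_a["name"], rec_b["name"]),
        jaccard(rec_a["coauthors"], rec_b["coauthors"]),
        jaccard(rec_a["title"].lower().split(), rec_b["title"].lower().split()),
    ]


if __name__ == "__main__":
    # Invented toy data: each inner list holds the author names of one paper.
    papers = [
        ["J. Smith", "A. Kumar", "M. Rossi"],
        ["John Smith", "A. Kumar"],
        ["J. Smith", "A. Kumar"],
    ]
    print(coauthor_weights(papers).most_common(1))

    rec_a = {"name": "J. Smith", "coauthors": ["A. Kumar", "M. Rossi"],
             "title": "Ontology based name disambiguation"}
    rec_b = {"name": "John Smith", "coauthors": ["A. Kumar"],
             "title": "Name disambiguation with ontologies"}
    print(similarity_vector(rec_a, rec_b))
```

A vector like this, computed for every candidate pair of records, is the kind of input a pairwise classifier could take to decide whether two records refer to the same person. A synonym pair would be expected to score high on the co-author and title indicators despite differing name forms, while a homonym pair would score high on the name indicator only.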
This work is partially supported by Project 3 “ICT supporting the educational processes and the knowledge management in higher education (ELINF)” of the NETWORK University Cooperation “Strengthening of the role of ICT in Cuban Universities for the development of the society”. We thank Carlos Alberto Morell for his useful suggestions and ideas and the team of Li Zhang, Wei Lu and Jinqing Yang for providing the corpus used to train the Doc2Vec model of the gold standard dataset LAGOS-AND.
References
Shoaib, M., Daud, A., Amjad, T.: Author name disambiguation in bibliographic databases: a survey. arXiv preprint arXiv:2004.06391, pp. 1–24 (2020)
Wang, P., Zhao, J., Huang, K., Xu, B.: A unified semi-supervised framework for author disambiguation in academic social network. In: Decker, H., Lhotská, L., Link, S., Spies, M., Wagner, R.R. (eds.) Conference 2014, LNCS, vol. 8645, pp. 1–16. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10085-2_1
Hussain, I., Asghar, S.: A survey of author name disambiguation techniques: 2010–2016. Knowl. Eng. Rev. 32, 1–24 (2017). https://doi.org/10.1017/S0269888917000182
Ferreira, A.A., Gonçalves, M.A., Laender, A.H.F.: Automatic disambiguation of author names in bibliographic repositories. In: Synthesis Lectures on Information Concepts, Retrieval, and Services, vol. 12(1), pp. 1–146. Morgan & Claypool Publishers (2020). https://doi.org/10.2200/S01011ED1V01Y202005ICR070
Zhang, L., Lu, W., Yang, J.: LAGOS-AND: a large, gold standard dataset for scholarly author name disambiguation. arXiv preprint arXiv:2104.01821, pp. 1–27 (2021)
Fiannaca, A., La Rosa, M., Gaglio, S., Rizzo, R., Urso, A.: An ontological-based knowledge organization for bioinformatics workflow management system. EMBnet. J. 18(B), 110–112 (2012). https://doi.org/10.14806/ej.18.B.570
Kurki, J., Hyvönen, E.: Authority control of people and organizations on the semantic web. In: Proceedings of the International Conferences on Digital Libraries and the Semantic Web 2009 (ICSD2009), September 2009, Trento, Italy, p. 15 (2009)
Pattuelli, M.C.: From uniform identifiers to graphs, from individuals to communities: what we talk about when we talk about linked person data. In: Challenges and Opportunities for Knowledge Organization in the Digital Age, pp. 571–580. Ergon-Verlag (2018). https://doi.org/10.5771/9783956504211-571
Kim, J.: Scale free collaboration networks: an author name disambiguation perspective. J. Assoc. Inf. Sci. Technol. 70(7), 685–700 (2019). https://doi.org/10.1002/asi.24158
Thenmozhi, D., Aravindan, C.: Ontology-based Tamil-English cross-lingual information retrieval system. Sadhana 43(157), 1–14 (2018). https://doi.org/10.1007/s12046-018-0942-7
Zaman, G., et al.: An ontological framework for information extraction from diverse scientific sources. IEEE Access 9, 42111–42124 (2021). https://doi.org/10.1109/ACCESS.2021.3063181
Hassell, J., Aleman-Meza, B., Arpinar, I.B.: Ontology-Driven Automatic Entity Disambiguation in Unstructured Text. In: Cruz, I., Decker, S., Allemang, D., Preist, C., Schwabe, D., Mika, P., Uschold, M., Aroyo, L.M. (eds.) ISWC 2006. LNCS, vol. 4273, pp. 44–57. Springer, Heidelberg (2006). https://doi.org/10.1007/11926078_4
Park, Y.-T., Kim, J.-M.: OnCU system: ontology-based category utility approach for author name disambiguation. In: 2nd International Conference on Ubiquitous Information Management and Communication Proceedings, pp. 63–68. New York, USA (2008). https://doi.org/10.1145/1352793.1352807
Lu, Z., Yan, Z., He, L.: OnPerDis: ontology-based personal name disambiguation on the web. In: 2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) Proceedings, vol. 1, pp. 185–192. IEEE (2013). https://doi.org/10.1109/WI-IAT.2013.28
Kurakawa, K., et al.: Researcher Name Resolver: identifier management system for Japanese researchers. Int. J. Digit. Libr. 14(1–2), 39–58 (2014). https://doi.org/10.1007/s00799-014-0109-z
Han, H., Yao, C., Fu, Y., Yu, Y., Zhang, Y., Xu, S.: Semantic fingerprints-based author name disambiguation in Chinese documents. Scientometrics 111(3), 1879–1896 (2017). https://doi.org/10.1007/s11192-017-2338-6
Bravo, M., Reyes-Ortiz, J.A., Cruz, I.: Researcher profile ontology for academic environment. Book Sect. Adv. Intell. Syst. Comput. 943, 799–817 (2019). https://doi.org/10.1007/978-3-030-17795-9_60
Färber, M., Ao, L.: The Microsoft Academic Knowledge Graph enhanced: author name disambiguation, publication classification, and embeddings. Quant. Sci. Stud. 3(1), 51–98 (2022). https://doi.org/10.1162/qss_a_00183
Santini, C., Gesese, G.A., Peroni, S., Gangemi, A., Sack, H., Alam, M.: A knowledge graph embeddings based approach for author name disambiguation using literals. Scientometrics 127(8), 4887–4912 (2022). https://doi.org/10.1007/s11192-022-04426-2
Gnoyke, P., Matta, K.: Author name disambiguation by clustering based on deep learned pairwise similarities, pp. 0–12, May (2020)
Firdaus, F., et al.: Author identification in bibliographic data using deep neural networks. TELKOMNIKA (Telecommun. Comput. Electron. Control) 19(3), 911–919 (2021). https://doi.org/10.12928/telkomnika.v19i3.18877
Ahmedi, L., Abazi-Bexheti, L., Kadriu, A.: A uniform semantic web framework for co-authorship networks. In: IEEE Ninth International Conference on Dependable, Autonomic and Secure Computing Proceedings, no. 2, pp. 958–965 (2011). https://doi.org/10.1109/DASC.2011.159
Gómez-Pérez, A., Suárez-Figueroa, M.C.: NeOn methodology for building ontology networks: a scenario-based methodology (2009)
Suárez-Figueroa, M.C., Gómez-Pérez, A., Mariano, F.-L.: The NeOn methodology framework: a scenario-based methodology for ontology development. Appl. Ontol. 10(2), 107–145 (2015). https://doi.org/10.3233/AO-150145
Leiva-Mederos, A., García-Duarte, D., Gálvez-Lio, D., Hidalgo-Delgado, Y., Senso-Ruíz, J.S: An ontological model for the failure detection in power electric systems. In: Iberoamerican Knowledge Graphs and Semantic Web Conference Proceedings, pp. 130–146 (2020). https://doi.org/10.1007/978-3-030-65384-2
Díaz-de-la-Paz, L., Riestra-Collado, F.N., García-Mendoza, J.L., González-Gonzalez, L.M., Leiva-Mederos, A.A., Taboada-Crispi, A.: Weights estimation in the completeness measurement of bibliographic metadata. Comput. Sist. 25(1), 117–128 (2021). https://doi.org/10.13053/cys-25-1-3355
Le, Q.V., Mikolov, T.: Distributed representations of sentences and documents. In: International Conference on Machine Learning Proceedings, arXiv preprint arXiv:1405.4053, vol. 32(2), pp. 1188–1196 (2014)
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Díaz-de-la-Paz, L., Concepción-Pérez, L., Portal-Díaz, J.A., Taboada-Crispi, A., Leiva-Mederos, A.A. (2022). Framework for Author Name Disambiguation in Scientific Papers Using an Ontological Approach and Deep Learning. In: Villazón-Terrazas, B., Ortiz-Rodriguez, F., Tiwari, S., Sicilia, MA., Martín-Moncunill, D. (eds) Knowledge Graphs and Semantic Web. KGSWC 2022. Communications in Computer and Information Science, vol 1686. Springer, Cham. https://doi.org/10.1007/978-3-031-21422-6_16
DOI: https://doi.org/10.1007/978-3-031-21422-6_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-21421-9
Online ISBN: 978-3-031-21422-6
eBook Packages: Computer Science, Computer Science (R0)