Clustering Semantic Predicates in the Open Research Knowledge Graph

Arab Oghli, Omar; D’Souza, Jennifer; Auer, Sören

doi:10.1007/978-3-031-21756-2_39

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13636)

Included in the following conference series:

International Conference on Asian Digital Libraries

Abstract

When semantically describing knowledge graphs (KGs), users must make a critical choice of vocabulary (i.e., predicates and resources). The success of KG building depends on convergence toward shared vocabularies, so that meaning can be established. The typical lifecycle of a new KG can be described as follows: the nascent phases of graph construction experience terminology divergence, while later phases experience terminology convergence and reuse. In this paper, we describe our approach of tailoring two AI-based clustering algorithms to recommend predicates (in RDF statements) about resources in the Open Research Knowledge Graph (ORKG, https://orkg.org/). Such a service, which recommends existing predicates for semantifying newly incoming scholarly publication data, is of paramount importance for fostering terminology convergence in the ORKG.

Our experiments show very promising results: high precision with relatively high recall at linear runtime. Furthermore, this work offers novel insights into the predicate groups that accrue automatically and loosely as generic patterns for the semantification of scholarly knowledge spanning 44 research fields.
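The idea of recommending predicates from clusters can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper: the abstract does not name the two clustering algorithms, so TF-IDF vectors with a small cosine-based k-means are used here purely as a stand-in. The toy titles, the predicate labels, and all function names are hypothetical and are not taken from the ORKG data.

```python
import math
from collections import Counter

# Hypothetical toy corpus: (paper title, predicates a curator used for it).
# Neither the titles nor the predicate labels come from the ORKG dump.
PAPERS = [
    ("covid-19 reproduction number estimate in wuhan",
     ["basic reproduction number", "location", "time period"]),
    ("estimate of the covid-19 reproduction number in italy",
     ["basic reproduction number", "location"]),
    ("reproduction number of covid-19 during lockdown",
     ["basic reproduction number", "time period", "confidence interval"]),
    ("named entity recognition with pretrained language model",
     ["task", "model", "F1 score"]),
    ("language model benchmark for entity recognition",
     ["task", "benchmark dataset", "F1 score"]),
    ("pretrained language model for question answering benchmark",
     ["task", "model", "benchmark dataset"]),
]

def tokenize(text):
    return text.lower().split()

def build_idf(texts):
    """Smoothed inverse document frequency for every term in the corpus."""
    df = Counter(term for text in texts for term in set(tokenize(text)))
    return {term: math.log(len(texts) / count) + 1.0 for term, count in df.items()}

def normalize(vec):
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}

def vectorize(text, idf):
    """Unit-length TF-IDF vector; terms unseen during fitting are ignored."""
    tf = Counter(t for t in tokenize(text) if t in idf)
    return normalize({t: c * idf[t] for t, c in tf.items()})

def cosine(a, b):
    # Both inputs are unit vectors, so the dot product is the cosine similarity.
    return sum(w * b[t] for t, w in a.items() if t in b)

def nearest(vec, centroids):
    return max(range(len(centroids)), key=lambda i: cosine(vec, centroids[i]))

def kmeans(vectors, k, iterations=10):
    """Cosine k-means with deterministic farthest-first seeding."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            min(vectors, key=lambda v: max(cosine(v, c) for c in centroids)))
    labels = []
    for _ in range(iterations):
        labels = [nearest(v, centroids) for v in vectors]
        for i in range(k):
            total = Counter()
            for vec, label in zip(vectors, labels):
                if label == i:
                    total.update(vec)  # adds the float weights term by term
            if total:  # keep the old centroid if the cluster is empty
                centroids[i] = normalize(dict(total))
    return centroids, labels

def recommend(text, idf, centroids, labels, top_n=3):
    """Return the most frequent predicates in the cluster nearest to `text`."""
    cluster = nearest(vectorize(text, idf), centroids)
    counts = Counter(
        predicate
        for (_, predicates), label in zip(PAPERS, labels)
        if label == cluster
        for predicate in predicates
    )
    return [predicate for predicate, _ in counts.most_common(top_n)]

TEXTS = [title for title, _ in PAPERS]
IDF = build_idf(TEXTS)
CENTROIDS, LABELS = kmeans([vectorize(t, IDF) for t in TEXTS], k=2)
print(recommend("covid-19 reproduction number in germany", IDF, CENTROIDS, LABELS))
```

In this sketch a new paper is assigned to its nearest cluster, and the predicates already used most often in that cluster are suggested, which is one simple way a recommender could nudge contributors toward reusing existing vocabulary.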

Supported by TIB Leibniz Information Centre for Science and Technology and the EU H2020 ERC project ScienceGRaph (GA ID: 819536).

Notes

1. https://orkg.org/orkg/api/rdf/dump

Author information

Authors and Affiliations

TIB Leibniz Information Centre for Science and Technology, Hannover, Germany
Omar Arab Oghli, Jennifer D’Souza & Sören Auer

Corresponding author

Correspondence to Omar Arab Oghli.

Editor information

Editors and Affiliations

National Taiwan Normal University, Taipei, Taiwan
Yuen-Hsien Tseng
Doshisha University, Kyoto, Japan
Marie Katsurai
VNU University of Engineering and Technology, Hanoi, Vietnam
Hoa N. Nguyen

About this paper

Cite this paper

Arab Oghli, O., D’Souza, J., Auer, S. (2022). Clustering Semantic Predicates in the Open Research Knowledge Graph. In: Tseng, YH., Katsurai, M., Nguyen, H.N. (eds) From Born-Physical to Born-Virtual: Augmenting Intelligence in Digital Libraries. ICADL 2022. Lecture Notes in Computer Science, vol 13636. Springer, Cham. https://doi.org/10.1007/978-3-031-21756-2_39

DOI: https://doi.org/10.1007/978-3-031-21756-2_39
Published: 07 December 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-21755-5
Online ISBN: 978-3-031-21756-2
eBook Packages: Computer Science, Computer Science (R0)
