LiterallyWikidata - A Benchmark for Knowledge Graph Completion Using Literals

Gesese, Genet Asefa; Alam, Mehwish; Sack, Harald

doi:10.1007/978-3-030-88361-4_30

Conference paper
First Online: 30 September 2021

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12922)

Abstract

To embed a Knowledge Graph (KG) in a low-dimensional vector space, it is beneficial to preserve as much of the semantics as possible from the different components of the KG. Accordingly, several link prediction approaches have been proposed that leverage literals in addition to the commonly used links between entities. However, the procedures used to create the existing datasets pay no attention to literals. This study therefore presents LiterallyWikidata, a set of KG completion benchmark datasets extracted from Wikidata and Wikipedia. It has been prepared with the main focus on providing benchmark datasets for multimodal KG Embedding (KGE) models, specifically models that use numeric and/or text literals. The benchmark is thus novel compared to the existing datasets in that it properly handles literals for such multimodal KGE models. LiterallyWikidata contains three datasets that vary in both size and structure. Benchmarking experiments on the link prediction task have been conducted on LiterallyWikidata with extensively tuned unimodal and multimodal KGE models. The datasets are available at https://doi.org/10.5281/zenodo.4701190.

Notes

1. The details including the DOI are given under the reference [14].
2. https://www.themoviedb.org/
3. https://dumps.wikimedia.org/wikidatawiki/
4. http://www.opengis.net/ont/geosparql#
5. http://www.w3.org/2001/XMLSchema#
6. https://pykeen.readthedocs.io/en/latest/
7. https://github.com/GenetAsefa/LiterallyWikidata

References

1. Akrami, F., Saeef, M.S., Zhang, Q., Hu, W., Li, C.: Realistic re-evaluation of knowledge graph completion methods: An experimental study. In: Proceedings of the ACM SIGMOD International Conference on Management of Data (2020)
2. Ali, M., et al.: Bringing light into the dark: a large-scale evaluation of knowledge graph embedding models under a unified framework. arXiv preprint arXiv:2006.13365 (2020)
3. Batagelj, V., Zaveršnik, M.: Fast algorithms for determining (generalized) core groups in social networks. Adv. Data Anal. Classif. 5(2), 129–145 (2011)
4. van Berkel, L., de Boer, V.: kgbench: A collection of knowledge graph datasets for evaluating relational and multimodal machine learning. In: ESWC (2021)
5. Bollacker, K., Evans, C., Paritosh, P., Sturge, T., Taylor, J.: Freebase: a collaboratively created graph database for structuring human knowledge. In: Proceedings of the ACM SIGMOD international conference on Management of data (2008)
6. Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J., Yakhnenko, O.: Translating embeddings for modeling multi-relational data. In: NIPS (2013)
7. Bouchard, G., Singh, S., Trouillon, T.: On approximate reasoning capabilities of low-rank vector spaces. In: AAAI Spring Symposia (2015)
8. Carlson, A., Betteridge, J., Kisiel, B., Settles, B., Hruschka, E.R., Mitchell, T.M.: Toward an architecture for never-ending language learning. In: Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence. AAAI Press (2010)
9. Daza, D., Cochez, M., Groth, P.: Inductive entity representations from text via link prediction. In: Proceedings of the Web Conference 2021, pp. 798–808 (2021)
10. Dettmers, T., Minervini, P., Stenetorp, P., Riedel, S.: Convolutional 2d knowledge graph embeddings. In: Thirty-Second AAAI Conference on Artificial Intelligence (2018)
11. García-Durán, A., Bordes, A., Usunier, N.: Effective blending of two and three-way interactions for modeling multi-relational data. In: Calders, T., Esposito, F., Hüllermeier, E., Meo, R. (eds.) ECML PKDD 2014. LNCS (LNAI), vol. 8724, pp. 434–449. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-44848-9_28
12. García-Durán, A., Bordes, A., Usunier, N.: Composing relationships with translations. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 286–290. Association for Computational Linguistics (2015)
13. García-Durán, A., Niepert, M.: KBLRN: End-to-end learning of knowledge base representations with latent, relational, and numerical features. In: Globerson, A., Silva, R. (eds.) Proceedings of the Thirty-Fourth Conference on Uncertainty in Artificial Intelligence, pp. 372–381. AUAI Press (2018)
14. Gesese, G.A., Alam, M., Sack, H.: LiterallyWikidata - A Benchmark for Knowledge Graph Completion using Literals (2021). https://doi.org/10.5281/zenodo.4701190
15. Gesese, G.A., Biswas, R., Alam, M., Sack, H.: A survey on knowledge graph embeddings with literals: Which model links better literal-ly?. arXiv preprint arXiv:1910.12507 (2019)
16. Guo, S., Wang, Q., Wang, L., Wang, B., Guo, L.: Knowledge graph embedding with iterative guidance from soft rules. In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (2018)
17. Harper, F.M., Konstan, J.A.: The movielens datasets: history and context. ACM Trans. Interact. Intell. Syst. 5(4), 1–19 (2015)
18. Hinton, G.E., et al.: Learning distributed representations of concepts. In: Proceedings of the Eighth Annual Conference of the Cognitive Science Society, vol. 1, p. 12. Amherst (1986)
19. Kemp, C., Tenenbaum, J.B., Griffiths, T.L., Yamada, T., Ueda, N.: Learning systems of concepts with an infinite relational model. In: Proceedings of the 21st National Conference on Artificial Intelligence, vol. 1, pp. 381–388. AAAI 2006, AAAI Press (2006)
20. Kok, S., Domingos, P.: Statistical predicate invention. In: Proceedings of the 24th International Conference on Machine Learning. Association for Computing Machinery (2007)
21. Kristiadi, A., Khan, M.A., Lukovnikov, D., Lehmann, J., Fischer, A.: Incorporating literals into knowledge graph embeddings. In: Ghidini, C., et al. (eds.) ISWC 2019. LNCS, vol. 11778, pp. 347–363. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-30793-6_20
22. Lin, Y., Liu, Z., Sun, M.: Knowledge representation learning with entities, attributes and relations. In: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence IJCAI 2016, pp. 2866–2872. AAAI Press (2016)
23. Mahdisoltani, F., Biega, J., Suchanek, F.M.: Yago3: A knowledge base from multilingual wikipedias. In: CIDR (2015)
24. McCray, A.: An upper-level ontology for the biomedical domain. Comp. Funct. Genomics 4, 80–84 (2003)
25. Miller, G.A.: Wordnet: a lexical database for english. Commun. ACM 38, 39–41 (1995)
26. Mitchell, T., et al.: Never-ending learning. Commun. ACM 61(5), 103–115 (2018)
27. Pezeshkpour, P., Chen, L., Singh, S.: Embedding multimodal relational data for knowledge base completion. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 3208–3218. Association for Computational Linguistics (2018)
28. Rummel, R.J.: Dimensionality of nations project: Attributes of nations and behavior of nation dyads, 1950–1965 (1992)
29. Safavi, T., Koutra, D.: CoDEx: A comprehensive knowledge graph completion benchmark. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (2020)
30. Safavi, T., Koutra, D., Meij, E.: Improving the utility of knowledge graph embeddings with calibration. arXiv preprint arXiv:2004.01168 (2020)
31. Shah, H., Villmow, J., Ulges, A., Schwanecke, U., Shafait, F.: An open-world extension to knowledge graph completion models. In: AAAI (2019)
32. Socher, R., Chen, D., Manning, C.D., Ng, A.Y.: Reasoning with neural tensor networks for knowledge base completion. In: Proceedings of the 26th International Conference on Neural Information Processing Systems, vol. 1 (2013)
33. Suchanek, F.M., Kasneci, G., Weikum, G.: Yago: A core of semantic knowledge. In: 16th International Conference on the World Wide Web, pp. 697–706 (2007)
34. Sun, Z., Deng, Z.H., Nie, J.Y., Tang, J.: Rotate: knowledge graph embedding by relational rotation in complex space. In: International Conference on Learning Representations (2019)
35. Tay, Y., Tuan, L.A., Phan, M.C., Hui, S.C.: Multi-task neural network for non-discrete attribute prediction in knowledge graphs. In: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, pp. 1029–1038. Association for Computing Machinery (2017)
36. Toutanova, K., Chen, D.: Observed versus latent features for knowledge base and text inference. In: Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality (2015)
37. Vrandečić, D., Krötzsch, M.: Wikidata: a free collaborative knowledgebase. Commun. ACM 57(10), 78–85 (2014)
38. Wang, Q., Mao, Z., Wang, B., Guo, L.: Knowledge graph embedding: a survey of approaches and applications. IEEE Trans. Knowl. Data Eng. 29(12), 2724–2743 (2017)
39. Wang, Z., Zhang, J., Feng, J., Chen, Z.: Knowledge graph embedding by translating on hyperplanes. In: Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence AAAI 2014, pp. 1112–1119. AAAI Press (2014)
40. Wu, Y., Wang, Z.: Knowledge graph embedding with numeric attributes of entities. In: Proceedings of The Third Workshop on Representation Learning for NLP, pp. 132–136. Association for Computational Linguistics (2018)
41. Xie, R., Liu, Z., Jia, J., Luan, H., Sun, M.: Representation learning of knowledge graphs with entity descriptions. In: AAAI (2016)
42. Xiong, W., Hoang, T., Wang, W.Y.: DeepPath: A reinforcement learning method for knowledge graph reasoning. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP) (2017)
43. Yang, B., Yih, W.t., He, X., Gao, J., Deng, L.: Embedding entities and relations for learning and inference in knowledge bases. In: International Conference on Learning Representations (ICLR) (2015)

Author information

Authors and Affiliations

FIZ Karlsruhe – Leibniz Institute for Information Infrastructure, Eggenstein-Leopoldshafen, Germany
Genet Asefa Gesese, Mehwish Alam & Harald Sack
Karlsruhe Institute of Technology, Institute AIFB, Karlsruhe, Germany
Genet Asefa Gesese, Mehwish Alam & Harald Sack

Corresponding authors

Correspondence to Genet Asefa Gesese, Mehwish Alam or Harald Sack.

Editor information

Editors and Affiliations

University of Würzburg, Würzburg, Germany
Andreas Hotho
Linköping University, Linköping, Sweden
Eva Blomqvist
University of Düsseldorf, Düsseldorf, Germany
Stefan Dietze
IBM Research - Thomas J. Watson Research, Hawthorne, CA, USA
Achille Fokoue
University of Texas, Austin, TX, USA
Ying Ding
Imperial College, London, UK
Payam Barnaghi
Australian National University, Canberra, ACT, Australia
Armin Haller
Fondazione Bruno Kessler, Povo, Trento, Italy
Mauro Dragoni
The Open University Walton Hall, Milton Keynes, UK
Harith Alani

Cite this paper

Gesese, G.A., Alam, M., Sack, H. (2021). LiterallyWikidata - A Benchmark for Knowledge Graph Completion Using Literals. In: Hotho, A., et al. The Semantic Web – ISWC 2021. ISWC 2021. Lecture Notes in Computer Science, vol 12922. Springer, Cham. https://doi.org/10.1007/978-3-030-88361-4_30

DOI: https://doi.org/10.1007/978-3-030-88361-4_30
Published: 30 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-88360-7
Online ISBN: 978-3-030-88361-4
eBook Packages: Computer Science, Computer Science (R0)
