Abstract
This article presents an observational study of the virtual graph formed by equivalence links between agent entities across 8 knowledge bases (KBs). To evaluate the potential of this linked data graph, we measured the equivalences that it could provide for a real dataset. We crawled the virtual graph starting from the references to agents found in descriptions of objects that Europeana collects from cultural heritage institutions. Our study characterizes the current virtual equivalence graph, presenting statistics about the links, their types and their origins. Gathering the equivalences for agent URIs required several crawling iterations over the virtual equivalence graph. The number of gathered equivalences grows steeply in the first 3 crawling iterations and stabilizes on the 4th iteration. VIAF was the KB with the highest number of equivalences, reaching 60.7%, followed by Wikidata with 34.5%.
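The iterative crawl described in the abstract can be pictured as a breadth-first expansion over equivalence links. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the function `crawl_equivalences`, the `get_links` callback and the toy `ex:` URIs are all hypothetical, and a real crawler would dereference each URI and read its equivalence statements instead of consulting an in-memory table.

```python
from typing import Callable, Iterable, List, Set


def crawl_equivalences(
    seeds: Iterable[str],
    get_links: Callable[[str], Iterable[str]],
    max_iterations: int = 10,
) -> List[Set[str]]:
    """Return the set of newly discovered URIs for each crawling iteration."""
    seen: Set[str] = set(seeds)
    frontier: Set[str] = set(seen)
    discovered_per_iteration: List[Set[str]] = []

    for _ in range(max_iterations):
        new_uris: Set[str] = set()
        for uri in frontier:
            # get_links is a hypothetical stand-in for dereferencing the URI
            # and extracting the targets of its equivalence links.
            for target in get_links(uri):
                if target not in seen:
                    seen.add(target)
                    new_uris.add(target)
        if not new_uris:
            # Nothing new was found: the crawl has stabilized.
            break
        discovered_per_iteration.append(new_uris)
        frontier = new_uris

    return discovered_per_iteration


# Toy, made-up link table used only to exercise the function.
TOY_LINKS = {
    "ex:a": ["ex:b"],
    "ex:b": ["ex:c", "ex:a"],
    "ex:c": ["ex:d"],
    "ex:d": [],
}

rounds = crawl_equivalences(["ex:a"], lambda uri: TOY_LINKS.get(uri, []))
print([sorted(found) for found in rounds])
```

Recording the new URIs per iteration makes it possible to observe when the crawl stops yielding equivalences, which is the kind of stabilization the abstract reports.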
Notes
- 8. For readability purposes, in this article we abbreviate namespaces as follows: owl for http://www.w3.org/2002/07/owl#; skos for http://www.w3.org/2004/02/skos/core#; schema for http://schema.org/; wdt for http://www.wikidata.org/prop/direct/.
- 10. We aim to provide insights that could be beneficial to providers and users of the original metadata; therefore, we have excluded the URIs used in automatic enrichment by Europeana (cf. https://pro.europeana.eu/page/europeana-semantic-enrichment).
- 11. This goes beyond the actual formal semantics of these properties, but we wanted to experiment with it nonetheless, to get an upper bound on the level of benefit obtainable from the equivalences. Experience also shows that the biggest data quality issues actually lie elsewhere.
- 12. The list of Wikidata properties for external identifiers is available at https://www.wikidata.org/wiki/Special:ListProperties/external-id.
- 13. The properties will contain an attribute wdt:P1921 (formatter URI for RDF resource).
- 14. These 4 URIs are: http://data.bnf.fr/#foaf:Person; http://data.bnf.fr/#foaf:Organization; http://data.bnf.fr/#owl:Thing; and http://data.bnf.fr/#spatialThing. None corresponds to an actual agent at BnF. We have emailed the VIAF maintainers about it.
- 15. For space reasons we cannot refer to all presentations and articles here. Some of them are accessible on the online documentation for the KB considered, given as earlier references.
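Two of the notes above lend themselves to a small illustration: the namespace abbreviations (owl, skos, schema, wdt) and the wdt:P1921 formatter used to turn an external identifier into an RDF resource URI. The sketch below is an assumption-laden example, not code from the study. The helper names `expand` and `uri_from_formatter`, the `example.org` formatter and the identifier value are hypothetical, and the code assumes that the formatter value marks the identifier position with `$1`, as Wikidata formatter values conventionally do.

```python
# Namespace abbreviations as listed in the notes.
PREFIXES = {
    "owl": "http://www.w3.org/2002/07/owl#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "schema": "http://schema.org/",
    "wdt": "http://www.wikidata.org/prop/direct/",
}


def expand(abbreviated: str) -> str:
    """Expand an abbreviated name such as 'owl:sameAs' into a full URI."""
    prefix, _, local_name = abbreviated.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"unknown prefix: {prefix}")
    return PREFIXES[prefix] + local_name


def uri_from_formatter(formatter: str, identifier: str) -> str:
    """Build a resource URI by substituting the identifier into a formatter.

    Assumes the formatter uses '$1' as the placeholder for the identifier.
    """
    return formatter.replace("$1", identifier)


print(expand("owl:sameAs"))
print(expand("wdt:P1921"))

# Hypothetical formatter and identifier, for illustration only.
print(uri_from_formatter("http://example.org/authority/$1", "12345"))
```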
Acknowledgments
This work was partly supported by Portuguese national funds through Fundação para a Ciência e a Tecnologia (FCT) with reference UIDB/50021/2020 and by the European Commission under contract number 30-CE-0885387/00-80.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Freire, N., Manguinhas, H., Isaac, A. (2020). An Observational Study of Equivalence Links in Cultural Heritage Linked Data for agents. In: Hall, M., Merčun, T., Risse, T., Duchateau, F. (eds) Digital Libraries for Open Knowledge. TPDL 2020. Lecture Notes in Computer Science, vol 12246. Springer, Cham. https://doi.org/10.1007/978-3-030-54956-5_5
DOI: https://doi.org/10.1007/978-3-030-54956-5_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-54955-8
Online ISBN: 978-3-030-54956-5
eBook Packages: Computer Science, Computer Science (R0)