Abstract
Publishing a dataset as Linked Data with connections to existing datasets is not an easy task for a data owner: there are so many datasets that it is hard to find the related ones, download them, and check their content (let alone apply entity matching over them). However, connections with other datasets are important for discoverability, browsing, and querying in general. To alleviate this problem, in this paper we introduce LODChain, a service that helps a provider strengthen the connections between their dataset and the rest of the datasets. LODChain finds the common entities, schema elements, and triples between the dataset at hand and hundreds of LOD datasets and, through equivalence reasoning, suggests to the user various inferred connections as well as related datasets. In addition, it detects erroneous mappings and offers various content-based dataset discovery services that enable the enrichment of the datasets’ content. The key difference from existing approaches is that they are metadata-based, whereas what we propose is data-based. We present an implementation of LODChain and report various experimental results over real and synthetic datasets.
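To make the idea of equivalence reasoning over common entities concrete, the following is a minimal Python sketch, not LODChain's actual implementation. It assumes equivalences are given as pairs of URIs (for example, owl:sameAs links) and computes their symmetric and transitive closure with a union-find structure, then counts how many equivalence classes a dataset shares with each other dataset. The function names (`same_as_classes`, `inferred_connections`), the input layout, and the sample URIs are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def same_as_classes(pairs, uris=()):
    """Group URIs into equivalence classes (symmetric, transitive closure).

    `pairs` is an iterable of (uri_a, uri_b) equivalence links; `uris` lists
    additional URIs that should appear even if they have no links.
    Both inputs are hypothetical formats used only for this sketch.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for uri in uris:
        find(uri)  # register URIs that have no equivalence link
    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two classes

    classes = defaultdict(set)
    for uri in list(parent):
        classes[find(uri)].add(uri)
    return list(classes.values())


def inferred_connections(my_dataset, entity_sources, pairs):
    """Count, per other dataset, the equivalence classes shared with my_dataset.

    `entity_sources` maps each URI to the set of dataset names containing it.
    """
    shared = defaultdict(int)
    for cls in same_as_classes(pairs, uris=entity_sources):
        datasets = set()
        for uri in cls:
            datasets |= entity_sources.get(uri, set())
        if my_dataset in datasets:
            for other in datasets - {my_dataset}:
                shared[other] += 1
    return dict(shared)


# Illustrative data: MyDataset links only to DBpedia, DBpedia links to Wikidata.
pairs = [("my:Aristotle", "dbp:Aristotle"), ("dbp:Aristotle", "wd:Q868")]
entity_sources = {
    "my:Aristotle": {"MyDataset"},
    "dbp:Aristotle": {"DBpedia"},
    "wd:Q868": {"Wikidata"},
    "dbp:Plato": {"DBpedia"},
}
connections = inferred_connections("MyDataset", entity_sources, pairs)
```

In this toy input, MyDataset has no direct link to Wikidata; the connection appears only because the closure places all three Aristotle URIs in the same class, which is the kind of inferred connection the abstract refers to. A data-based service operating over hundreds of datasets would need dedicated indexes rather than in-memory dictionaries.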
Acknowledgments
This work has received funding from the European Union’s Horizon 2020 coordination and support action 4CH (Grant agreement No 101004468).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Mountantonakis, M., Tzitzikas, Y. (2022). LODChain: Strengthen the Connectivity of Your RDF Dataset to the Rest LOD Cloud. In: Sattler, U., et al. The Semantic Web – ISWC 2022. ISWC 2022. Lecture Notes in Computer Science, vol 13489. Springer, Cham. https://doi.org/10.1007/978-3-031-19433-7_31
DOI: https://doi.org/10.1007/978-3-031-19433-7_31
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-19432-0
Online ISBN: 978-3-031-19433-7
eBook Packages: Computer Science (R0)