Abstract
Linked Data provides access to huge, continuously growing amounts of open data and ontologies in RDF format that describe entities, links and properties on those entities. Equipping Linked Data with inference paves the way to making the Semantic Web a reality. In this survey, we describe a unifying framework for RDF ontologies and databases that we call deductive RDF triplestores. It consists in equipping RDF triplestores with Datalog inference rules. This rule language makes it possible to capture in a uniform manner OWL constraints that are useful in practice, such as property transitivity or symmetry, as well as domain-specific rules with practical relevance for users in many domains of interest. The expressivity and the genericity of this framework are illustrated for modeling Linked Data applications and for developing inference algorithms. In particular, we show how it allows the problem of data linkage in Linked Data to be modeled as a reasoning problem on possibly decentralized data. We also explain how it makes it possible to efficiently extract expressive modules from Semantic Web ontologies and databases with formal guarantees, whilst effectively controlling their succinctness. Experiments conducted on real-world datasets have demonstrated the feasibility of this approach and its usefulness in practice for data integration and information extraction.
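To make the idea of a deductive RDF triplestore concrete, the following is a minimal sketch, not the chapter's implementation. It assumes triples are plain Python tuples, variables are strings starting with `?`, and rules are safe (every head variable occurs in the body). The property names `partOf` and `adjacentTo`, the sample facts, and the function names are hypothetical and chosen only for illustration. It saturates a set of triples with one transitivity rule and one symmetry rule using naive forward chaining.

```python
def is_var(term):
    """Variables are written with a leading '?', as in SPARQL."""
    return term.startswith("?")

def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`; return None on failure."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # Bind the variable, or check consistency with an earlier binding.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new

def apply_rule(body, head, facts):
    """Return every instance of `head` derivable from `facts` in one step."""
    bindings = [{}]
    for pattern in body:
        bindings = [b2 for b in bindings for f in facts
                    if (b2 := match(pattern, f, b)) is not None]
    return {tuple(b.get(t, t) for t in head) for b in bindings}

def saturate(facts, rules):
    """Naive forward chaining: apply all rules until no new triple appears."""
    facts = set(facts)
    while True:
        derived = set()
        for body, head in rules:
            derived |= apply_rule(body, head, facts)
        if derived <= facts:
            return facts
        facts |= derived

# Hypothetical rules: partOf is transitive, adjacentTo is symmetric.
RULES = [
    ([("?x", "partOf", "?y"), ("?y", "partOf", "?z")], ("?x", "partOf", "?z")),
    ([("?x", "adjacentTo", "?y")], ("?y", "adjacentTo", "?x")),
]

# Hypothetical sample triples.
FACTS = {
    ("hand", "partOf", "arm"),
    ("arm", "partOf", "body"),
    ("arm", "adjacentTo", "shoulder"),
}

closure = saturate(FACTS, RULES)
```

Here `closure` contains the three original triples plus the two inferred ones, `("hand", "partOf", "body")` and `("shoulder", "adjacentTo", "arm")`. Naive saturation re-derives known facts at every round; it is used here only because it is the shortest correct formulation of the fixpoint.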
This work has been partially supported by the ANR projects Pagoda (12-JS02-007-01) and Qualinca (12-CORD-012), the joint NSFC-ANR Lindicle project (12-IS01-0002), and LabEx PERSYVAL-Lab (11-LABX-0025-01).
Notes
- 2. We have slightly modified the INA vocabulary (e.g. translating French terms into English terms) for the sake of readability.
- 6. We only consider rules that conclude sameAs statements because other rules can be handled with preprocessing by tools like Silk or LIMES.
Copyright information
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Rousset, MC., Atencia, M., David, J., Jouanot, F., Palombi, O., Ulliana, F. (2017). Datalog Revisited for Reasoning in Linked Data. In: Ianni, G., et al. Reasoning Web. Semantic Interoperability on the Web. Reasoning Web 2017. Lecture Notes in Computer Science, vol 10370. Springer, Cham. https://doi.org/10.1007/978-3-319-61033-7_5
DOI: https://doi.org/10.1007/978-3-319-61033-7_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-61032-0
Online ISBN: 978-3-319-61033-7
eBook Packages: Computer Science, Computer Science (R0)