Datalog Revisited for Reasoning in Linked Data

Reasoning Web. Semantic Interoperability on the Web (Reasoning Web 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10370)

Abstract

Linked Data provides access to huge, continuously growing amounts of open data and ontologies in RDF format that describe entities, links and properties on those entities. Equipping Linked Data with inference paves the way to making the Semantic Web a reality. In this survey, we describe a unifying framework for RDF ontologies and databases that we call deductive RDF triplestores. It consists in equipping RDF triplestores with Datalog inference rules. This rule language makes it possible to capture, in a uniform manner, OWL constraints that are useful in practice, such as property transitivity or symmetry, as well as domain-specific rules with practical relevance for users in many domains of interest. The expressivity and genericity of this framework are illustrated for modeling Linked Data applications and for developing inference algorithms. In particular, we show how it makes it possible to model the problem of data linkage in Linked Data as a reasoning problem on possibly decentralized data. We also explain how it makes it possible to efficiently extract expressive modules from Semantic Web ontologies and databases with formal guarantees, whilst effectively controlling their succinctness. Experiments conducted on real-world datasets have demonstrated the feasibility of this approach and its usefulness in practice for data integration and information extraction.
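
To make the idea of a deductive RDF triplestore concrete, the following is a minimal sketch, not the chapter's implementation, of naive bottom-up evaluation of Datalog rules over a set of RDF triples. The `ex:` property names, the ISBN-based linkage rule, and the function names are assumptions chosen for illustration only; variables are written as strings starting with `?`.

```python
def is_var(term):
    """Variables are strings that start with '?' (a convention of this sketch)."""
    return isinstance(term, str) and term.startswith("?")

def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if new.setdefault(p, t) != t:
                return None  # variable already bound to a different term
        elif p != t:
            return None  # constant mismatch
    return new

def match_body(body, facts, binding=None):
    """Yield every binding that satisfies all body patterns against `facts`."""
    binding = {} if binding is None else binding
    if not body:
        yield binding
        return
    first, rest = body[0], body[1:]
    for fact in facts:
        extended = match(first, fact, binding)
        if extended is not None:
            yield from match_body(rest, facts, extended)

def substitute(pattern, binding):
    """Instantiate a head pattern with the values bound in `binding`."""
    return tuple(binding.get(term, term) for term in pattern)

def saturate(facts, rules):
    """Apply all rules repeatedly until no new triple can be derived."""
    facts = set(facts)
    while True:
        derived = {
            substitute(head, b)
            for head, body in rules
            for b in match_body(body, facts)
        }
        if derived <= facts:
            return facts
        facts |= derived

# Each rule is (head, body); every head variable also occurs in the body.
RULES = [
    # Transitivity of a property (hypothetical ex:partOf).
    (("?x", "ex:partOf", "?z"),
     [("?x", "ex:partOf", "?y"), ("?y", "ex:partOf", "?z")]),
    # Symmetry of a property (hypothetical ex:adjacentTo).
    (("?x", "ex:adjacentTo", "?y"),
     [("?y", "ex:adjacentTo", "?x")]),
    # Hypothetical domain-specific linkage rule: same ISBN implies sameAs.
    (("?x", "owl:sameAs", "?y"),
     [("?x", "ex:isbn", "?i"), ("?y", "ex:isbn", "?i")]),
]

FACTS = {
    ("ex:finger", "ex:partOf", "ex:hand"),
    ("ex:hand", "ex:partOf", "ex:arm"),
    ("ex:arm", "ex:adjacentTo", "ex:shoulder"),
    ("ex:book1", "ex:isbn", "123"),
    ("ex:book2", "ex:isbn", "123"),
}

closure = saturate(FACTS, RULES)
```

Here `closure` contains the input triples together with every triple the three rules entail from them. The naive loop re-evaluates all rules on every pass, which keeps the sketch short; a practical engine would typically use semi-naive evaluation or a backward-chaining strategy instead.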

This work has been partially supported by the ANR projects Pagoda (12-JS02-007-01) and Qualinca (12-CORD-012), the joint NSFC-ANR Lindicle project (12-IS01-0002), and LabEx PERSYVAL-Lab (11-LABX-0025-01).

Notes

  1. https://jena.apache.org/documentation/inference/

  2. We have slightly modified the INA vocabulary (e.g. translating French terms into English terms) for the sake of readability.

  3. http://lucene.apache.org/solr/

  4. http://wiki.dbpedia.org/Downloads2015-04

  5. http://linkedbrainz.org/

  6. We only consider rules that conclude to sameAs statements because other rules can be handled with preprocessing by tools like Silk or LIMES.

  7. fma.biostr.washington.edu

  8. www.mycorporisfabrica.org

  9. www.ihtsdo.org/snomed-ct

  10. www.dbpedia.org

  11. www.cs.ox.ac.uk/isg/tools/ModuleExtractor/

  12. http://mycorporisfabrica.org/mycf/

Author information

Corresponding author

Correspondence to Marie-Christine Rousset.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Rousset, MC., Atencia, M., David, J., Jouanot, F., Palombi, O., Ulliana, F. (2017). Datalog Revisited for Reasoning in Linked Data. In: Ianni, G., et al. Reasoning Web. Semantic Interoperability on the Web. Reasoning Web 2017. Lecture Notes in Computer Science, vol 10370. Springer, Cham. https://doi.org/10.1007/978-3-319-61033-7_5

  • DOI: https://doi.org/10.1007/978-3-319-61033-7_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-61032-0

  • Online ISBN: 978-3-319-61033-7

  • eBook Packages: Computer Science, Computer Science (R0)
