A Model for Digital Libraries and its Translation to RDF

  • Original Article
  • Published:
Journal on Data Semantics

Abstract

With the advent of the Web, the traditional concept of library has undergone a profound change: from a collection of physical information resources (mostly books) to a collection of digital resources. In addition, the notion of digital resource includes not only texts in digital form, but also, in general, any kind of multimedia resources. In a traditional library, physical information resources are managed through well-understood manual procedures, whereas in a digital library digital resources are organized according to a data model, discovered through a query language and managed in a highly automated way. In this paper, we present a data model and query language for digital libraries supporting identification, structuring, metadata support, re-use and discovery of digital resources. The model that we propose is inspired by the Web and it is formalized as a first-order theory, certain models of which correspond to the notion of digital library. Additionally, we provide a full translation of the model in RDF and of the query language in SPARQL.

Notes

  1. http://www.europeana.eu.

  2. In the context of the web, the act of applying the reference function to an identifier \(i\) in order to obtain a representation of the identified resource is called de-referencing \(i\).

  3. http://www.viaf.org

  4. http://www.getty.edu/research/tools/vocabularies/tgn/index.html.

  5. http://www.openarchives.org/ore/.

  6. MARC stands for MAchine Readable Cataloguing, and is a very popular metadata format, adopted, amongst others, by the Library of Congress. ISO Standard 2709 is based on MARC.

  7. http://www.culture.gouv.fr/documentation/joconde/fr/pres.htm.

  8. http://www.europeana.eu/portal/usingeuropeana_search.html.

  9. http://www.google.ca/advanced_search.

  10. Recall from Sect. 4.5 that we use the terms “interpretation” and “metadata base” interchangeably.

  11. http://www.delos.info/.

  12. http://www.dspace.org/.

  13. http://www.greenstone.org/.

References

  1. Greenstone digital library software. http://www.greenstone.org/

  2. Baader F, Calvanese D, McGuinness DL, Nardi D, Patel-Schneider PF (eds) (2003) The description logic handbook: theory, implementation, and applications, 2nd edn. Cambridge University Press, Cambridge

  3. Berners-Lee T, Fielding R, Masinter L (2005) Uniform resource identifiers (URI): generic syntax. RFC 3986, The Internet Engineering Task Force, Network Working Group. http://www.ietf.org/rfc/rfc3986.txt

  4. Berners-Lee T, Hendler J, Lassila O (2001) The semantic web. Scientific American Magazine

  5. Brickley D, Guha RV (2004) RDF vocabulary description language 1.0: RDF schema. W3C Recommendation, WWW Consortium. http://www.w3.org/TR/rdf-schema/

  6. Candela L, Castelli D, Ferro N, Koutrika G, Meghini C, Ioannidis Y, Pagano P, Ross S, Soergel D, Agosti M, Dobreva M, Katifori V, Schuldt H (2007) The DELOS Digital Library Reference Model–foundations for digital libraries. DELOS Network of Excellence on Digital Libraries. ISBN 2-912337-37-X

  7. Candela L, Castelli D, Ioannidis Y, Koutrika G, Pagano P, Ross S, Schek H-J, Schuldt H (2007) Setting the foundations of digital libraries: the DELOS manifesto. D-Lib Mag 13(3/4)

  8. Castagna G (1995) Covariance and contravariance: conflict without a cause. ACM Trans Program Lang Syst 17(3):431–447

  9. Cohen E (1997) Size-estimation framework with applications to transitive closure and reachability. J Comput Syst Sci 55:441–453

  10. Dantsin E, Eiter TH, Gottlob G, Voronkov A (2001) Complexity and expressive power of logic programming. ACM Comput Surv 33(3):374–425

  11. Van de Sompel H, Bekaert J, Liu X, Balakireva L, Schwander T (2005) aDORe: a modular, standards-based digital object repository. Comput J 48(5):514–535

  12. Doerr M (2003) The CIDOC conceptual reference model: an ontological approach to semantic interoperability of metadata. AI Magazine 24(3):75–92

  13. Goncalves MA, Fox EA, Watson LT, Kipp NA (2004) Streams, structures, spaces, scenarios, societies (5s): a formal model for digital libraries. ACM Trans Inf Syst 22(2):270–312

  14. Halpin H, Presutti V (2009) An ontology of resources for linked data. In Proceedings of LDOW 2009, linked data on the web, WWW2009 workshop, Madrid, Spain. http://wtlab.um.ac.ir/parameters/wtlab/filemanager/LD_resources/LDOW2009/ldow2009_paper19.pdf

  15. Harris S, Seaborne A (2011) SPARQL 1.1 query language. W3C working draft. http://www.w3.org/TR/sparql11-query/

  16. Hayes P (2006) In defense of ambiguity. In Proceedings of IRW2006, the identity, reference, and the web, WWW2006 workshop, Edinburgh, United Kingdom. http://www.ibiblio.org/hhalpin/homepage/publications/indefenseofambiguity.html

  17. Hayes P (2004) RDF semantics. W3C recommendation, WWW consortium. http://www.w3.org/TR/rdf-mt/

  18. Heath T, Bizer C (2011) Linked data. Evolving the web into a global data space. Morgan & Claypool, San Rafael

  19. Dublin Core Metadata Initiative (2008) Dublin Core metadata element set, version 1.1. http://dublincore.org/documents/dces/

  20. Jacobs I, Walsh N (2004) Architecture of the World Wide Web, vol 1. W3C Recommendation, WWW Consortium. http://www.w3.org/TR/webarch/

  21. Klyne G, Carroll JJ (2004) Resource description framework (RDF): concepts and abstract syntax. W3C Recommendation, WWW Consortium. http://www.w3.org/TR/rdf-concepts/

  22. Lagoze C, Payette S, Shin E, Wilper C (2006) Fedora: an architecture for complex objects and their relationships. Int J Digit Lib 6:124–138

  23. Lloyd JW (1987) Foundations of logic programming. Springer, Berlin

  24. Manola F, Miller E (2004) RDF Primer. W3C Recommendation, WWW Consortium. http://www.w3.org/TR/rdf-primer/

  25. Meghini C, Doerr M, Spyratos N (2009) Managing co-reference knowledge for data integration. In: Kiyoki Y, Tokuda T, Jaakkola H, Chen X, Yoshida N (eds) Information modelling and knowledge bases XX. In: Frontiers in artificial intelligence and applications, vol 190. IOS Press, Amsterdam

  26. Meghini C, Sebastiani F, Straccia U, Thanos C (1993) A model of information retrieval based on a terminological logic. In Proceedings of the 16th annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR 1993)

  27. Munoz-Venegas S, Perez J, Gutierrez C (2009) Simple and efficient minimal RDFS. J Web Seman 7:220–234

  28. Perez J, Arenas M, Gutierrez C (2009) Semantics and complexity of SPARQL. ACM Trans Database Syst (TODS) 34(3)

  29. Prud’hommeaux E, Seaborne A (2008) SPARQL query language for RDF. W3C Recommendation. http://www.w3.org/TR/rdf-sparql-query/

  30. Rigaux P, Spyratos N (2004) Metadata inference for document retrieval in a distributed repository. In Proceedings of ASIAN’04, The 9th Asian computing science conference. Lecture notes in computer science, vol 3321. Chiang-Mai, Thailand. Springer, Berlin

  31. Smith M, Barton M, Bass M, Branschofsky M, McClellan G, Stuve D, Tansley R, Walker JH (2003) DSpace: an open source dynamic digital repository. D-Lib Mag 9(1)

  32. ter Horst HJ (2005) Completeness, decidability and complexity of entailment for rdf schema and a semantic extension involving the OWL vocabulary. J Web Seman 3:79–115

  33. Presutti V, Gangemi A (2008) Identity of resources and entities on the web. Int J Semant Web Inf Syst 4(2)

  34. Yang J (2012) A data model for digital libraries. PhD thesis, École doctorale: Informatique de Paris-Sud. Spécialité: Informatique

Acknowledgments

This work was partially supported by the following sources: (1) EU project ASSETS (“Advanced Service Search and Enhancing Technological Solutions for the European Digital Library”, CIP-ICT PSP-2009-3, Grant Agreement n. 250527), and (2) CNRS International Collaboration Project (PICS 5220). We also thank Europeana for building the ideal forum for discussing the making of a digital library.

Author information

Corresponding author

Correspondence to Carlo Meghini.

Appendices

Appendix A: Translating Identifiers to URI References

RDF supports three kinds of terms [21]: URI references, literals and blank nodes. URI references (URIrefs) [3] are resource identifiers, thus they naturally correspond to the identifiers of our model. We will therefore translate identifiers of our model as URIrefs. To this end, we assume a special namespace for the URIrefs resulting from the translation of identifiers. We shall call this namespace N, without going into the details of its actual syntax.

Conceptually, if no URIref is used as an identifier, the translation is trivial: identifiers of \(I\) are mapped to URIrefs in the namespace N in an injective manner. However, users may want to re-use URIrefs from existing namespaces as identifiers in the metadata base, for interoperability reasons. We leave this freedom to the user, but we require the user to respect the following three constraints that guarantee the injectivity of the translation:

  1. No URIref in the RDFDL namespace can be used as an identifier in the metadata base; we shall call this the id constraint;

  2. If a URIref is used as a schema identifier in the metadata base, then it must be a syntactically valid URIref terminating with the sharp symbol “#”. Moreover, if a URIref is used as a class or property identifier in the metadata base, then it must be a fragment identifier. This constraint will guarantee that in the RDF translation of an interpretation, we can compose schema identifiers with the identifiers of classes or properties. We note that since a URIref can contain at most one sharp symbol, we can recover from such a URIref the original schema and class (or property) identifiers.

  3. If a URIref \(s\) is used as a schema identifier in the metadata base and a URIref \(f\) is used as class or property identifier, then their concatenation (:) is not in the metadata base, namely: \(s\hbox {:}f \not \in {\mathsf {ID}}.\)

Technically, the translation of metadata base identifiers to URIrefs is carried out by the functions \(\underline{\cdot }_s, \,\underline{\cdot }_f, \,\underline{\cdot }_d, \,\underline{\cdot }_{cr}\) and \(\underline{\cdot },\) whose definitions rely on the following auxiliary functions \(\phi _s, \,\phi _f, \,\phi _d, \,\phi _{cr}\) and \(\phi :\)

  • \(\phi _s\) is an injective function that translates a given non-URIref schema identifier to a URIref with suffix “#” in the namespace N;

  • \(\phi _f\) is an injective function that translates a given non-fragment class or property identifier to a fragment identifier in the namespace N;

  • \(\phi _d\) is an injective function that translates a given non-URIref description identifier to a URIref in the namespace N;

  • \(\phi _{cr}\) is an injective function that translates a given non-URIref composable resource identifier into a URIref in the namespace N.

Injectivity requires that no two different identifiers in the metadata base yield the same URIref. This can be realized in several ways, for instance by incrementing a counter. However, we shall not enter into technical details here.

Based on the above functions, we now define the functions \(\underline{\cdot }_s, \,\underline{\cdot }_f, \,\underline{\cdot }_d, \,\underline{\cdot }_{cr}\), as follows:

Each of these functions generates a correct URIref from the input identifier when that identifier is not already a (correct) URIref, and returns the identifier unchanged otherwise. By their definition, these functions are injective because the auxiliary \(\phi \) functions are injective.

We note that the sharp symbol (#) is not universally used in the URIs of existing vocabularies; for instance, the URI of the class “Bird” in DBPedia is http://dbpedia.org/ontology/Bird. As a consequence, http://dbpedia.org/ontology/Bird will be mapped to a different URI \(u\) in the namespace N by function \(\phi _{cr}.\) This is not ideal; however, there is no universally accepted convention for naming resources in RDF (apart from the rules of the URI syntax), so this kind of mismatch will occur for certain classes of URIs no matter what convention the translation to RDF relies upon. What is important is that the formal properties of the translation are not broken.

Finally, for convenience of notation, we define a generic function \(\underline{\cdot }\) that, given an input identifier \(i,\) applies to \(i\) one of the translation functions defined above, depending on the type of \(i:\)

$$\begin{aligned} \underline{i}&= \left\{ \begin{array}{ll} \underline{i}_s &{} \hbox {if i is a schema identifier} \\ \underline{i}_f &{} \hbox {if i is a class or a property identifier} \\ \underline{i}_d &{} \hbox {if i is a metadata identifier} \\ \underline{i}_{cr} &{} \hbox {if i is a composable resource identifier}\\ \end{array} \right. \end{aligned}$$

Notice that the above function is injective, as a consequence of the injectivity of its defining functions.
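As an illustration only, the translation of identifiers described in this appendix might be sketched in Python as follows. The namespace string `N`, the reserved fragment prefix, the crude syntactic tests `is_uriref` and `is_fragment`, and the naming scheme of minted URIrefs are all assumptions of this sketch and are not taken from the paper; the sketch also assumes that user-supplied identifiers never fall in the reserved namespace. Fresh names are obtained by incrementing a counter, as suggested above.

```python
from itertools import count

N = "http://example.org/n/"   # hypothetical stand-in for the namespace N
FRAG_PREFIX = "n-"            # hypothetical reserved prefix for minted fragments
_fresh = count(1)
_minted = {}                  # (kind, identifier) -> minted URIref or fragment


def _reserved(ident):
    # Assumption of the sketch: users may not supply names we could mint.
    return ident.startswith(N) or ident.startswith(FRAG_PREFIX)


def is_uriref(ident):
    # Crude syntactic test, an assumption of this sketch.
    return "://" in ident


def is_fragment(ident):
    # Here a fragment identifier is a non-empty string without '#', '/' or ':'.
    return ident != "" and not any(ch in ident for ch in "#/:")


def phi(kind, ident):
    """Auxiliary function: mint one fresh name per (kind, identifier) pair."""
    key = (kind, ident)
    if key not in _minted:
        n = next(_fresh)  # a counter keeps all minted names distinct
        if kind == "s":
            _minted[key] = f"{N}schema{n}#"
        elif kind == "f":
            _minted[key] = f"{FRAG_PREFIX}{n}"
        else:
            _minted[key] = f"{N}{kind}{n}"
    return _minted[key]


def translate(kind, ident):
    """Generic translation: keep a correct identifier, otherwise mint one.

    kind is "s" (schema), "f" (class or property), "d" (description)
    or "cr" (composable resource).
    """
    if _reserved(ident):
        raise ValueError(f"{ident!r} falls in the reserved namespace")
    if kind == "s":
        ok = is_uriref(ident) and ident.endswith("#") and ident.count("#") == 1
    elif kind == "f":
        ok = is_fragment(ident)
    elif kind in ("d", "cr"):
        ok = is_uriref(ident)
    else:
        raise ValueError(f"unknown identifier kind {kind!r}")
    return ident if ok else phi(kind, ident)
```

With these assumptions, a schema identifier that is already a URIref ending in “#” is kept, whereas a class identifier such as http://dbpedia.org/ontology/Bird, which is not a fragment identifier, is replaced by a freshly minted fragment.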

Appendix B: Proofs

Lemma 1

(Soundness) Let \(G\) and \(H\) be RDFDL-graphs (i.e. \(G\) and \(H\) do not mention RDFS vocabulary outside the RDFDL vocabulary). If \(G \vdash _{RDFDL} H,\) then \(G \models _{RDFDL} H.\)

Proof

Let \(I\) be an RDFDL-interpretation (i.e. \(I\) satisfies all the RDFDL semantic conditions) such that \(I \models _{RDFDL} G;\) we must show that \(I \models _{RDFDL} H.\) This has been proved in [27] for rules (1b) to (7). Therefore we only need to show it for rules (RDFDL-1) and (RDFDL-2).

  • (RDFDL-1); Consider any triple \(t = (\)x a y\()\in H,\) which is derived from rule (RDFDL-1). Then there must be three triples \(\{\)(u rdfdl:isDescriptionOf x), (u rdfdl:hasProxy z), (z a y)\(\}\) in \(G\) from which \(t\) has been derived (because \(G \vdash _{RDFDL} H\)). As \(I \models _{RDFDL} G,\) then \(I \models _{RDFDL} \{\)(u rdfdl:isDescriptionOf x), (u rdfdl:hasProxy z), (z a y)\(\}.\) By semantic condition (SC-1), \(I \models _{RDFDL} t.\) As this is true for every \(t \in H,\) we have that \(I \models _{RDFDL} H.\)

  • (RDFDL-2); Consider any triple \(t \!=\! (\)u rdfdl:isDescriptionOf y\() \in H,\) which is derived from rule (RDFDL-2). Then there must be two triples \(\{\)(u rdfdl:isDescriptionOf x), (x rdfdl:isPartOf y)\(\}\) in \(G\) from which \(t\) has been derived (because \(G \vdash _{RDFDL} H\)). As \(I \models _{RDFDL} G,\) then \(I \models _{RDFDL} \{\)(u rdfdl:isDescriptionOf x), (x rdfdl:isPartOf y)\(\}.\) By semantic condition (SC-2), \(I \models _{RDFDL} t.\) As this is true for every \(t \in H,\) we have that \(I \models _{RDFDL} H.\)

\(\square \)

To prove the completeness of the set of rules, we first introduce the following notion of RDFDL closure of a graph. Define the graph \({RDFDL}\)-\(cl(G)\) as the closure of \(G\) under the application of rules (2)-(7), (RDFDL-1) and (RDFDL-2). Note that \({RDFDL}\)-\(cl(G)\) is an RDF graph over \(terms(G) \cup {RDFDL} ~vocabulary\) that is a superset of \(G\) and that is obtained after a finite number of applications of rules. The notion of closure of a graph, as well as the following lemma, is borrowed from [27].
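The closure construction can be illustrated by a small fixpoint computation. The following Python sketch covers only the two rules (RDFDL-1) and (RDFDL-2) as they are stated in the proofs of this appendix; rules (2)-(7) of [27] are deliberately left out. The representation of a graph as a set of string triples and the function name `rdfdl_closure` are assumptions of the sketch, as is the reading of (RDFDL-1) in which the predicate of the proxy triple is a variable.

```python
IS_DESC_OF = "rdfdl:isDescriptionOf"
HAS_PROXY = "rdfdl:hasProxy"
IS_PART_OF = "rdfdl:isPartOf"


def rdfdl_closure(graph):
    """Close a set of (subject, predicate, object) triples under
    (RDFDL-1) and (RDFDL-2) only."""
    closure = set(graph)
    while True:
        new = set()
        described = [(s, o) for (s, p, o) in closure if p == IS_DESC_OF]
        for (u, x) in described:
            # (RDFDL-1): (u isDescriptionOf x), (u hasProxy z), (z a y) -> (x a y)
            proxies = {o for (s, p, o) in closure if s == u and p == HAS_PROXY}
            for (z, a, y) in closure:
                if z in proxies:
                    new.add((x, a, y))
            # (RDFDL-2): (u isDescriptionOf x), (x isPartOf y) -> (u isDescriptionOf y)
            for (s, p, y) in closure:
                if s == x and p == IS_PART_OF:
                    new.add((u, IS_DESC_OF, y))
        if new <= closure:
            # Fixpoint reached. No rule creates new terms, so this always happens.
            return closure
        closure |= new


# Hypothetical example graph: a description d1 of page1, which is part of book1.
G = {
    ("d1", IS_DESC_OF, "page1"),
    ("d1", HAS_PROXY, "_:p"),
    ("_:p", "dc:creator", "Alice"),
    ("page1", IS_PART_OF, "book1"),
}
closed = rdfdl_closure(G)
```

In this example the description first propagates from the part to the whole by (RDFDL-2), after which (RDFDL-1) transfers the proxy's statement to both described resources.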

Lemma 2

Given an RDFDL-graph \(G\) that does not mention RDFS vocabulary outside RDFDL vocabulary, define the interpretation \(I_G \!=\! (Res, \, Prop, \, Class, \, Ext, \, CExt, \, Int)\) such that:

  • Res = \(terms(G) \cup {RDFDL} {~vocabulary}.\)

  • Prop = \(\{p\in vocabulary(G) ~|~ (s ~p ~o) \in {RDFDL}\)-\(cl(G)\} \, \cup ~{RDFDL} \, {vocabulary} \, \cup ~\{p \in terms(G) ~|~ (p\) subPropertyOf \(x), (y\) subPropertyOf \(p), (p\) domain \(z),\) or \((p\) range \(v) \in G\}.\)

  • Class = \(\{c \in terms(G) ~|~ (x\) type \(c) \in G\} \, \cup ~\{c \in terms(G) \, | \, (c\) subClassOf \(x), (y\) subClassOf \(c), (z\) domain \(c),\) or \((v\) range \(c)\in G\}.\)

  • Ext: Prop \(\rightarrow 2^{Res\times Res}\) the extension function such that: if \(p \in \) URIrefs \(\cap ~Prop\) then \(Ext(p) = \, \{(s~ o) \, | \, (s ~p ~o) \, \in {RDFDL}\)-\(cl(G)\}.\)

  • CExt: Class \(\rightarrow 2^{Res}\) a function such that: \(CExt(c) = \{x \in terms(G) ~|~ \, (x\) type \(c) \in {RDFDL}\)-\(cl(G)\}.\)

  • Int, the identity function over \(terms(G) \, \cup \, {RDFDL} vocabulary.\)

Then for every RDFDL-graph G, we have that \(I_G \models _{RDFDL} G.\)

Proof

See Lemma 32 in [27]. \(\square \)

Lemma 3

(Completeness) Let G, H be RDFDL-graphs that do not mention RDFS vocabulary outside RDFDL vocabulary. If \(G \models _{RDFDL} H,\) then \(G \vdash _{RDFDL} H.\)

Proof

It is sufficient to show that if \(t \in H,\) then \(t \in RDFDL\)-\(cl(G).\) This has been proved in [27] for rules (1b) to (7). Therefore we only need to show it for rules (RDFDL-1) and (RDFDL-2). Let \(I_G \!= (Res, \, Prop, \, Class, Ext, \, CExt, \, Int)\) be as defined in Lemma 2; then \(I_G \models _{RDFDL} G.\)

  • (RDFDL-1); Consider a triple \(t = (\)x a y\() \in H,\) which is entailed from semantic condition (SC-1). Then we must have three triples \(\{\)(u rdfdl:isDescriptionOf x), (u rdfdl:hasProxy z), (z a y)\(\}\) in \(G,\) and as \(I_G \models _{RDFDL} G,\) therefore \(I_G \models _{RDFDL} \{\)(u rdfdl:isDescriptionOf x), (u rdfdl:hasProxy z), (z a y)\(\}.\) By (RDFDL-1), \( I_G \models _{RDFDL} t,\) namely \(t \in {RDFDL}\)-\(cl(G).\)

  • (RDFDL-2); Consider a triple \(t = (\)u rdfdl:isDescriptionOf y\() \in H,\) which is entailed from semantic condition (SC-2). Then there must be two triples \(\{\)(u rdfdl:isDescriptionOf x), (x rdfdl:isPartOf y)\(\}\) in \(G,\) and as \(I_G \, \models _{RDFDL} G,\) therefore \(I_G \models _{RDFDL} \{\)(u rdfdl:isDescriptionOf x), (x rdfdl:isPartOf y)\(\}.\) By (RDFDL-2), \( I_G \models _{RDFDL} t,\) namely \(t \in {RDFDL}\)-\(cl(G).\) \(\square \)

Proposition 2

The function \(\Phi \) is injective.

To prove the injectivity of \(\Phi ,\) we separate the proof into three steps:

1) First, we prove that the variable identification is injective.

Proof

As the functions \(\underline{\cdot }_s, \,\underline{\cdot }_f, \,\underline{\cdot }_d, \,\underline{\cdot }_{cr}\) and \(\underline{\cdot }\) are injective, and for any schema identifier \(s,\) fragment identifier \(f,\) description identifier \(d\) and composable resource identifier \(i\) in \(I\) we have \( ~\underline{s}\hbox {:}\underline{f} \ne \underline{d}\) and \(\underline{s}\hbox {:}\underline{f} \ne \underline{i},\) the variable identification is injective. \(\square \)

2) Second, we prove that \(\varphi \) (in Table 3) is injective.

Proof

From Table 3,

  • if \(e = {\mathsf {SchCl}}(s,c)\); \(\varphi (e) = \{(\underline{s}\):\(\underline{c}\) rdf:type rdfs:Class)\(\},\) then the possible equivalent triples generated by \(\varphi \) are \(\{(\xi (d)\) rdf:type \(\underline{s}\):\(\underline{c})\},\) or \(\{(\xi (d) \, \underline{s}\):\(\underline{p} \, \underline{i})\},\) or \(\{(\underline{i}\) rdf:type \(\underline{s}\):\(\underline{c})\},\) or \(\{(\underline{i} \, \underline{s}\):\(\underline{p} \, \underline{j})\}.\) Because of the id constraint in Sect. 6.2, namely \(\underline{s}\):\(\underline{c} \, \ne \) rdfs:Class and \(\underline{s}\):\(\underline{p} \ne \) rdf:type, and because the functions \(\underline{\cdot }_s\) and \(\underline{\cdot }_f\) are injective, from \(e\) we get a unique \(\varphi (e) = \{(\underline{s}\):\(\underline{c}\) rdf:type rdfs:Class)\(\}.\) In the same manner, we can prove the case \(e = {\mathsf {SchPr}}(s,p)\): as \(\underline{s}\):\(\underline{c} \, \ne \) rdf:Property and \(\underline{s}\):\(\underline{p} \ne \) rdf:type, and the functions \(\underline{\cdot }_s\) and \(\underline{\cdot }_f\) are injective, from \(e\) we get a unique \(\varphi (e) = \{(\underline{s}\):\(\underline{p}\) rdf:type rdf:Property)\(\}.\)

  • if \(e = {\mathsf {Dom}}(s,p,c); \, \varphi (e) = \{(\underline{s}\):\(\underline{p}\) rdfs:domain \(\underline{s}\):\(\underline{c})\},\) then the possible equivalent triples generated by \(\varphi \) are \(\{(\xi (d) \, \underline{s}\):\(\underline{p} \, \underline{i})\},\) or \(\{(\underline{i} \, \underline{s}\):\(\underline{p} \, \underline{j})\}.\) Because of the id constraint in Sect. 6.2, namely \(\underline{s}\):\(\underline{p} \ne \) rdfs:domain, and because the functions \(\underline{\cdot }_s\) and \(\underline{\cdot }_f\) are injective, from \(e\) we get a unique \(\varphi (e) = \{(\underline{s}\):\(\underline{p}\) rdfs:domain \(\underline{s}\):\(\underline{c})\}.\) In the same manner, we can prove that if \(e \in \{{\mathsf {Ran}}(s,p,c), \, {\mathsf {IsaCl}}(s, c_1, c_2), {\mathsf {IsaPr}}\, (s, p_1, p_2), \, {\mathsf {PartOf}}(i,j), \, {\mathsf {DescOf}}(d,i)\},\) then, as \(\underline{s}\):\(\underline{p} \notin \{\)rdfs:range, rdfs:subClassOf, rdfs:subPropertyOf, rdfdl:isPartOf, rdfdl:isDescriptionOf\(\}\) and the functions \(\underline{\cdot }_s, \, \underline{\cdot }_f, \, \underline{\cdot }_d, \, \underline{\cdot }_{cr}\) and \(\underline{\cdot }\) are injective, from \(e\) we get a unique \(\varphi (e)\) in each case.

  • if \(e = {\mathsf {DescCl}}(d,s,c); \, \varphi (e) =\{(\underline{d}\) rdfdl:hasProxy \(\xi (d)), \, (\xi (d)\) rdf:type \(\underline{s}\):\(\underline{c})\},\) then for \((\xi (d)\) rdf:type \(\underline{s}\):\(\underline{c})\) the possible equivalent triples generated by \(\varphi \) are \(\{(\underline{s}\):\(\underline{c}\) rdf:type rdfs:Class\()\},\) or \(\{(\underline{s}\):\(\underline{p}\) rdf:type rdf:Property\()\},\) or \(\{(\xi (d) \, \underline{s}\):\(\underline{p} \, \underline{i})\},\) or \(\{(\underline{i}\) rdf:type \(\underline{s}\):\(\underline{c})\},\) or \(\{(\underline{i} \, \underline{s}\):\(\underline{p} \, \underline{j})\}.\) Because of the RDFDL constraint in Sect. 6.2, namely \(\underline{s}\):\(\underline{c} \, \notin \{\)rdfs:Class, rdf:Property\(\}\) and \(\underline{s}\):\(\underline{p} \ne \) rdf:type, because for any description \(d,\) any identifier \(i_1,\) schema identifier \(i_2\) and fragment identifier \(i_3\) in \(I\) we have \(\xi (d) \ne \underline{i_1}\) and \(\xi (d) \ne \underline{i_2}\):\(\underline{i_3},\) and because the functions \(\xi , \, \underline{\cdot }_s, \, \underline{\cdot }_f\) and \(\underline{\cdot }_d\) are injective, from \(e\) we get a unique \(\varphi (e) =\{(\underline{d}\) rdfdl:hasProxy \(\xi (d)), \, (\xi (d)\) rdf:type \(\underline{s}\):\(\underline{c})\}\). In the same manner, we can prove that if \(e = {\mathsf {ClInst}}(i,s,c),\) then from \(e\) we get a unique \(\varphi (e) = \{(\underline{i}\) rdf:type \(\underline{s}\):\(\underline{c})\}.\)

  • if \(e = {\mathsf {DescPr}}(d,s,p,i); \, \varphi (e) =\{(\underline{d}\) rdfdl:hasProxy \(\xi (d)), \, (\xi (d) \, \underline{s}\):\(\underline{p} \, \underline{i})\},\) then for \((\xi (d) \, \underline{s}\):\(\underline{p} \, \underline{i}),\) the possible equivalent triples generated by \(\varphi \) are all the triples in the third column of Table 3 except itself. Because of the RDFDL constraint in Sect. 6.2, namely \(\underline{s}\):\(\underline{p} \notin \{\)rdf:type, rdfs:domain, rdfs:range, rdfs:subClassOf, rdfs:subPropertyOf, rdfdl:hasProxy, rdfdl:isPartOf, rdfdl:isDescriptionOf\(\},\) because for any description \(d,\) any identifier \(i_1,\) schema identifier \(i_2\) and fragment identifier \(i_3\) in \(I\) we have \(\xi (d) \ne \underline{i_1}\) and \(\xi (d) \ne \underline{i_2}\):\(\underline{i_3},\) and because the functions \(\xi , \, \underline{\cdot }_s, \, \underline{\cdot }_f, \, \underline{\cdot }_{cr}, \, \underline{\cdot }_d\) and \(\underline{\cdot }\) are injective, from \(e\) we get a unique \(\varphi (e) =\{(\underline{d}\) rdfdl:hasProxy \(\xi (d)), \, (\xi (d) \, \underline{s}\):\(\underline{p} \, \underline{i})\}.\) In the same manner, we can prove that if \(e = {\mathsf {PrInst}}(i,s,p,j),\) then from \(e\) we get a unique \(\varphi (e) =\{(\underline{i} \, \underline{s}\):\(\underline{p} \, \underline{j})\}.\) \(\square \)

3) Third, we prove that for any interpretation \(I_1\) and \(I_2, \, I_1 \ne I_2 \hbox { implies } \Phi (I_1) \ne \Phi (I_2).\)

Proof

For any \(e\) in the second column of the Table 3, if \(I_1 \ne I_2,\) then either \(I_1 \setminus I_2 \ne \emptyset ,\) or \(I_2 \setminus I_1 \ne \emptyset .\) Without loss of generality we assume \(I_1 \setminus I_2\ne \emptyset .\) Then there exists \(e \in I_1\) such that \(e \not \in I_2.\)

Now let us prove there exists triple \(t \in \varphi (e)\) such that \(t \notin \Phi (I_2).\) From Table 3,

  • if \(e = {\mathsf {SchCl}}(s,c)\); then, as \(\varphi \) is injective, from \(e\) we get a unique \(\varphi (e) = \{(\underline{s}\):\(\underline{c}\) rdf:type rdfs:Class)\(\}.\) Therefore, if \(e \not \in I_2,\) then there exists \(t\) = (\(\underline{s}\):\(\underline{c}\) rdf:type rdfs:Class) \(\in \varphi (e)\) such that \(t \notin \Phi (I_2).\) In the same manner, we can prove that if \(e = {\mathsf {SchPr}}(s,p),\) then there exists a triple \(t = (\underline{s}\):\(\underline{p}\) rdf:type rdf:Property) \(\in \varphi (e)\) such that \(t \notin \Phi (I_2).\)

  • if \(e = {\mathsf {Dom}}(s,p,c);\) then, as \(\varphi \) is injective, from \(e\) we get a unique \(\varphi (e) = \{(\underline{s}\):\(\underline{p}\) rdfs:domain \(\underline{s}\):\(\underline{c})\}.\) Therefore, if \(e \not \in I_2,\) then there exists a triple \(t = (\underline{s}\):\(\underline{p}\) rdfs:domain \(\underline{s}\):\(\underline{c}) \in \varphi (e)\) such that \(t \notin \Phi (I_2).\) In the same manner, we can prove that if \(e \, \in \, \{{\mathsf {Ran}}(s,p,c), \, {\mathsf {IsaCl}}(s, c_1, c_2), {\mathsf {IsaPr}}(s, p_1, p_2), \, {\mathsf {PartOf}}(i,j), \, {\mathsf {DescOf}}(d,i)\},\) then from \(e\) we get a unique \(\varphi (e)\) in each case. Therefore, if \(e \not \in I_2,\) there exists a triple \(t \in \varphi (e)\) such that \(t \notin \Phi (I_2).\)

  • if \(e = {\mathsf {DescCl}}(d,s,c);\) then, as \(\varphi \) is injective, from \(e\) we get a unique \(\varphi (e) =\{(\underline{d}\) rdfdl:hasProxy \(\xi (d)), \, (\xi (d)\) rdf:type \(\underline{s}\):\(\underline{c})\}\). Therefore, if \(e \not \in I_2,\) then there exists a triple \(t\) = (\(\xi (d)\) rdf:type \(\underline{s}\):\(\underline{c}\)) \(\in \varphi (e)\) such that \(t \notin \Phi (I_2).\) In the same manner, we can prove that if \(e = {\mathsf {ClInst}}(i,s,c),\) then there exists a triple \(t \in \varphi (e)\) such that \(t \notin \Phi (I_2).\)

  • if \(e = {\mathsf {DescPr}}(d,s,p,i);\) then, as \(\varphi \) is injective, from \(e\) we get a unique \(\varphi (e) =\{(\underline{d}\) rdfdl:hasProxy \(\xi (d)), \, (\xi (d) \, \underline{s}\):\(\underline{p} \, \underline{i})\}.\) Therefore, if \(e \not \in I_2,\) then there exists a triple \(t\) = (\(\xi (d) \, \underline{s}\):\(\underline{p} \, \underline{i}\)) \(\in \varphi (e)\) such that \(t \notin \Phi (I_2).\) In the same manner, we can prove that if \(e = {\mathsf {PrInst}}(i,s,p,j),\) then there exists a triple \(t \in \varphi (e)\) such that \(t \notin \Phi (I_2).\)

Now we have, if \(e \in I_1\) and \(e \not \in I_2,\) then there exists triple \(t \in \varphi (e)\) such that \(t \in \Phi (I_1)\) and \(t \notin \Phi (I_2),\) namely \(\Phi (I_1) \ne \Phi (I_2).\) \(\square \)
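To make the shape of the translation concrete, the triples that the proof above attributes to \(\varphi \) can be collected in a short Python sketch. Table 3 itself is not reproduced in this excerpt, so the mapping below is assembled only from the cases mentioned in the proof. The tuple encoding of atoms, the proxy naming scheme in `xi`, and the function names are assumptions of the sketch; identifiers are assumed to be already translated and to contain no colon, so the underlining functions are omitted.

```python
def xi(d):
    # Hypothetical naming scheme for the proxy of description d.
    return f"_:proxy-{d}"


def qname(s, f):
    # Concatenation s:f of a schema identifier and a class/property identifier.
    return f"{s}:{f}"


def translate_atom(atom):
    """Set of triples for one ground atom, following the cases in the proof."""
    name, *a = atom
    if name == "SchCl":
        return {(qname(a[0], a[1]), "rdf:type", "rdfs:Class")}
    if name == "SchPr":
        return {(qname(a[0], a[1]), "rdf:type", "rdf:Property")}
    if name in ("Dom", "Ran"):
        s, p, c = a
        pred = "rdfs:domain" if name == "Dom" else "rdfs:range"
        return {(qname(s, p), pred, qname(s, c))}
    if name in ("IsaCl", "IsaPr"):
        s, x, y = a
        pred = "rdfs:subClassOf" if name == "IsaCl" else "rdfs:subPropertyOf"
        return {(qname(s, x), pred, qname(s, y))}
    if name == "PartOf":
        return {(a[0], "rdfdl:isPartOf", a[1])}
    if name == "DescOf":
        return {(a[0], "rdfdl:isDescriptionOf", a[1])}
    if name == "DescCl":
        d, s, c = a
        return {(d, "rdfdl:hasProxy", xi(d)), (xi(d), "rdf:type", qname(s, c))}
    if name == "ClInst":
        i, s, c = a
        return {(i, "rdf:type", qname(s, c))}
    if name == "DescPr":
        d, s, p, i = a
        return {(d, "rdfdl:hasProxy", xi(d)), (xi(d), qname(s, p), i)}
    if name == "PrInst":
        i, s, p, j = a
        return {(i, qname(s, p), j)}
    raise ValueError(f"unknown predicate {name!r}")


def translate_interpretation(interp):
    """Translation of an interpretation: the union of the atom translations."""
    graph = set()
    for atom in interp:
        graph |= translate_atom(atom)
    return graph


# Two hypothetical interpretations that differ in a single atom.
I1 = {("SchCl", "s", "C"), ("DescCl", "d", "s", "C")}
I2 = {("SchCl", "s", "C"), ("ClInst", "d", "s", "C")}
G1 = translate_interpretation(I1)
G2 = translate_interpretation(I2)
```

Here `G1` contains the proxy triple for `d`, which `G2` lacks, so the two interpretations are sent to different graphs, in line with step 3 of the proof.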

Proposition 3

Let \(I\) be any interpretation of \({\fancyscript{L}}, \, \Phi , \, {\fancyscript{M}}\) and \({\fancyscript{R}}\) the functions defined above. Then, the diagram in Fig. 4 commutes, that is:

$$\begin{aligned} \Phi ({I^\star })={\fancyscript{R}}(\Phi (I)). \end{aligned}$$

We will do the proof in two stages. In the first stage we will prove soundness, i.e.

$$\begin{aligned} \Phi ({I^\star })\subseteq {\fancyscript{R}}(\Phi (I)). \end{aligned}$$

In the second stage we will prove completeness, i.e.

$$\begin{aligned} {\fancyscript{R}}(\Phi (I))\subseteq \Phi ({I^\star }). \end{aligned}$$

Let us focus on soundness. We must prove that the translation \(\Phi (\alpha )\) of any formula \(\alpha \) in \({I^\star }\) is a logical RDF consequence of the translation \(\Phi (I)\) of \(I.\) Now, \(\alpha \) is in \({I^\star }\) iff \(\alpha \in X^i\) for some \(i.\) We do the proof by induction on \(i.\)

Proof

For \(i=0, \, X^0=I.\) Since \(\Phi (I)\subseteq {\fancyscript{R}}(\Phi (I)),\) we have the proof for this case. Now let us assume that \(\Phi (X^n)\subseteq {\fancyscript{R}}(\Phi (I)),\) and let us prove that \(\Phi (X^{n+1})\subseteq {\fancyscript{R}}(\Phi (I)).\) Let \(\alpha \in X^{n+1}\setminus X^n.\) In this case \(\alpha \) is the logical consequence of some axiom in \({P_{\fancyscript{A}}}\) or \({P_{{\fancyscript{L}}_+}}.\) We must carry out the proof for each rule in \({P_{\fancyscript{A}}}\) or \({P_{{\fancyscript{L}}_+}}.\) Since the pattern is essentially the same for each rule, we only show the proof for the rule \({\mathsf {SchPr}}(s,p) ~\mathtt {:\!\!-}~{\mathsf {Dom}}(s,p,c).\) The complete proof can be found in [34].

If \(I = \{{\mathsf {Dom}}(s,p,c)\} \subseteq X^n,\) then \(\Phi (I) = \{\)(\(s\):\(p\) rdfs:domain \(s\):\(c\))\(\}\) and, as an RDF consequence, \({\fancyscript{R}}(\Phi (I)) = \{\)(\(s\):\(p\) rdfs:domain \(s\):\(c\)), (\(s\):\(p\) rdf:type rdf:Property)\(\}.\) Since \({I^\star }= {\fancyscript{M}}(I) = \{{\mathsf {SchPr}}(s,p), {\mathsf {Dom}}(s,p,c)\},\) we have \(\Phi ({I^\star }) \, \subseteq \, {\fancyscript{R}}(\Phi (I)).\) \(\square \)
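The single case worked out above can be replayed mechanically. The following Python sketch restricts the model-side saturation to the one rule \({\mathsf {SchPr}}(s,p) ~\mathtt {:\!\!-}~{\mathsf {Dom}}(s,p,c)\) and the RDF-side saturation to the one consequence stated in the text (a subject of rdfs:domain is typed as rdf:Property). All function names and the tuple encoding are assumptions of the sketch, and it is in no way a substitute for the full proof in [34].

```python
def saturate_model(interp):
    """Model-side saturation, restricted to SchPr(s,p) :- Dom(s,p,c)."""
    derived = {("SchPr", a[1], a[2]) for a in interp if a[0] == "Dom"}
    return set(interp) | derived


def translate_interpretation(interp):
    """Translation to triples, restricted to Dom and SchPr atoms."""
    graph = set()
    for atom in interp:
        if atom[0] == "Dom":
            _, s, p, c = atom
            graph.add((f"{s}:{p}", "rdfs:domain", f"{s}:{c}"))
        elif atom[0] == "SchPr":
            _, s, p = atom
            graph.add((f"{s}:{p}", "rdf:type", "rdf:Property"))
        else:
            raise ValueError(f"atom {atom!r} is outside this sketch")
    return graph


def saturate_graph(graph):
    """RDF-side saturation, restricted to the consequence used in the text."""
    derived = {(s, "rdf:type", "rdf:Property")
               for (s, p, o) in graph if p == "rdfs:domain"}
    return set(graph) | derived


I = {("Dom", "s", "p", "c")}
left = translate_interpretation(saturate_model(I))    # translate the saturated model
right = saturate_graph(translate_interpretation(I))   # saturate the translated graph
```

On this input both paths around the diagram yield the same two triples, which is the equality claimed by the proposition for this rule.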

Now let us focus on completeness. We have to prove that:

$$\begin{aligned} {\fancyscript{R}}(\Phi (I))\subseteq \Phi ({I^\star }). \end{aligned}$$

That is, any triple that is inferred from \(\Phi (I)\) through the RDF deduction mechanism \({\fancyscript{R}}\) also ends up in the translation of the digital library \({I^\star },\) that is \(\Phi ({I^\star }).\) To this end, we will consider the triples produced by the function \(\Phi ,\) listed in the third column of Table 3 and, for each rule in the R set (see Sect. 6.3), we will consider which triples the rule would introduce in \({\fancyscript{R}}(\Phi (I)).\) We will then show that the same triples are also in \(\Phi ({I^\star }).\) Also in this case the proof is the application of the same pattern to all rules, therefore we will limit ourselves to three cases: the first rule in [27] and the two rules introduced in our calculus, that is RDFDL-1 and RDFDL-2. The complete proof can be found in [34].

To simplify notation, we will denote the inverse of the function \(\underline{\cdot }_s, \,\underline{\cdot }_f\) and \(\underline{\cdot }\) by using italics, that is, the constant symbol \(c\) whose translation in RDF \(\underline{c}\) is xxx will be denoted as \(xxx.\)

Proof

  1. 1.

    (2a) (\(A\), sp, \(B\)) (\(B\), sp, \(C\)) \(\rightarrow \) (\(A\), sp, \(C\)). If \(\Phi (I)\) contains triples \(\{\)(\(A\), sp, \(B\)), (\(B\), sp, \(C\))\(\}\), then \(\{(A,\) sp, \(C)\} \subseteq {\fancyscript{R}}(\Phi (I)).\) From Table 3, we can see that the triples in \(\Phi (I)\) can only be generated by \(\Phi \) with translation rule TR6 in the form of \(\{(s\):\(A\) rdfs:subPropertyOf \(s\):\(B\)), (\(s\):B rdfs:subPropertyOf \(s\):\(C)\},\) then \(\{{\mathsf {IsaPr}}(s,A,B), {\mathsf {IsaPr}}(s,B,C)\} \subseteq I\), by axiom (SI4), \(\{{\mathsf {IsaPr}}(s, \, A,C)\} \, \subseteq \, {I^\star }.\) By translation rule TR6, \(\{(s\):\(A\) rdfs:subPropertyOf \(s\):\(C)\} \subseteq \Phi ({I^\star }),\) namely \({\fancyscript{R}}(\Phi (I)) \subseteq \Phi ({I^\star }).\)

  2.

    (RDFDL-1) (\(U\) rdfdl:isDescriptionOf \(X\)) (\(U\) rdfdl:hasProxy \(Z\)) (\(Z ~A ~Y\)) \(\rightarrow \, (X ~A ~Y).\) If \(\Phi (I)\) contains the triples \(\{(U\) rdfdl:isDescriptionOf \(X\)), (\(U\) rdfdl:hasProxy \(Z\)), \((Z ~A ~Y)\},\) then \(\{(X ~A ~Y)\} \subseteq {\fancyscript{R}}(\Phi (I))\). From Table 3, these triples of \(\Phi (I)\) can only be generated by \(\Phi \) through translation rules TR10 and TR8, in the form \(\{(U\) rdfdl:isDescriptionOf \(X\)), (\(U\) rdfdl:hasProxy \(\xi (U)\)), \((\xi (U) \, s\):\(A \, Y)\}.\) Hence \(\{{\mathsf {DescOf}}(U,X), \, {\mathsf {DescPr}}(U,s,A,Y)\} \subseteq I\) and, by axiom (Q3), \(\{{\mathsf {PrInst}}\, (X,s,A,Y)\} \subseteq {I^\star }.\) By translation rule TR12, \(\{(X \, s\):\(A \, Y)\} \subseteq \Phi ({I^\star }),\) namely \({\fancyscript{R}}(\Phi (I)) \, \subseteq \, \Phi ({I^\star }).\)

  3.

    (RDFDL-2) (\(U\) rdfdl:isDescriptionOf \(X\)) (\(X\) rdfdl:isPartOf \(Y\)) \(\rightarrow \) (\(U\) rdfdl:isDescriptionOf \(Y\)). If \(\Phi (I)\) contains the triples \(\{(U\) rdfdl:isDescriptionOf \(X\)), (\(X\) rdfdl:isPartOf \(Y)\},\) then \(\{(U\) rdfdl:isDescriptionOf \(Y)\} \subseteq {\fancyscript{R}}(\Phi (I))\). From Table 3, these triples of \(\Phi (I)\) can only be generated by \(\Phi \) through translation rules TR10 and TR9, in the form \(\{(U\) rdfdl:isDescriptionOf \(X\)), (\(X\) rdfdl:isPartOf \(Y)\}.\) Hence \(\{{\mathsf {DescOf}}(U, \, X), \, {\mathsf {PartOf}}(X,Y)\} \subseteq I\) and, by axiom (D2), \(\{{\mathsf {DescOf}}(U,Y)\} \subseteq {I^\star }.\) By translation rule TR10, \(\{(U\) rdfdl:isDescriptionOf \(Y)\} \subseteq \Phi ({I^\star }),\) namely \({\fancyscript{R}}(\Phi (I)) \subseteq \Phi ({I^\star }).\)

\(\square \)
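The three rules treated in the proof can be illustrated by a small fixpoint computation. The following Python fragment is only a sketch under stated assumptions: triples are plain tuples of strings, the function name `closure` and the constants are hypothetical, and only rules (2a), RDFDL-1 and RDFDL-2 are applied rather than the full rule set of Sect. 6.3.

```python
# Predicate names as they appear in the rules; the Python constants are hypothetical.
SP = "rdfs:subPropertyOf"
DESC_OF = "rdfdl:isDescriptionOf"
HAS_PROXY = "rdfdl:hasProxy"
PART_OF = "rdfdl:isPartOf"


def closure(triples):
    """Apply rules (2a), RDFDL-1 and RDFDL-2 until no new triple is produced."""
    graph = set(triples)
    while True:
        new = set()
        for (a, p, b) in graph:
            if p == SP:
                # (2a): (A sp B), (B sp C) -> (A sp C)
                new |= {(a, SP, c) for (b2, q, c) in graph
                        if q == SP and b2 == b}
            if p == DESC_OF:
                u, x = a, b
                # RDFDL-1: (U descOf X), (U hasProxy Z), (Z A Y) -> (X A Y)
                proxies = {z for (u2, q, z) in graph
                           if q == HAS_PROXY and u2 == u}
                new |= {(x, prop, y) for (z, prop, y) in graph
                        if z in proxies}
                # RDFDL-2: (U descOf X), (X partOf Y) -> (U descOf Y)
                new |= {(u, DESC_OF, y) for (x2, q, y) in graph
                        if q == PART_OF and x2 == x}
        if new <= graph:
            return graph
        graph |= new


# Toy input with made-up identifiers.
example = {
    ("s:A", SP, "s:B"),
    ("s:B", SP, "s:C"),
    ("d1", DESC_OF, "r1"),
    ("d1", HAS_PROXY, "px"),
    ("px", "s:title", "Odyssey"),
    ("r1", PART_OF, "r2"),
}
closed = closure(example)
```

On this toy input the closure contains, among others, the triples (s:A rdfs:subPropertyOf s:C), (r1 s:title Odyssey) and (d1 rdfdl:isDescriptionOf r2), one for each of the three rules.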

Proposition 4

For every query \(\alpha \) in the query language \({\fancyscript{Q}}\) over a digital library \({\fancyscript{D}}, \, \Phi (ans(\alpha , \, {\fancyscript{D}^\star })) \, = ans_R(\Psi (\alpha ), \Phi (I^\star )).\)

First, we define the answering function for SPARQL, \(ans_R,\) following the approach in [28].

Let \(V\) be an infinite set of variables and let \(U\) be the union of the pairwise disjoint infinite sets of IRIs, blank nodes and literals. A mapping \(\mu \) from \(V\) to \(U\) is a partial function \(\mu :V \rightarrow U.\) For a triple pattern \(t,\) let \(\mu (t)\) be the triple obtained by replacing the variables in \(t\) according to \(\mu .\) The domain of \(\mu ,\) denoted dom(\(\mu \)), is the subset of \(V\) where \(\mu \) is defined. Two mappings \(\mu _1\) and \(\mu _2\) are compatible if \(\mu _1(?x) = \mu _2(?x)\) for every \(?x \in \hbox {dom}(\mu _1) \cap \hbox {dom}(\mu _2),\) that is, if \(\mu _1 \cup \mu _2\) is also a mapping. Finally, var(\(t\)) denotes the set of variables occurring in the components of \(t.\)
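As an informal illustration of these notions, a mapping can be sketched in Python as a dictionary from variable names to RDF terms. The representation (strings for terms, a leading "?" for variables) and the function names `var`, `apply_mapping` and `compatible` are assumptions made for this sketch only.

```python
def is_var(term):
    """Variables are written with a leading '?', as in SPARQL."""
    return isinstance(term, str) and term.startswith("?")


def var(t):
    """var(t): the set of variables occurring in the triple pattern t."""
    return {term for term in t if is_var(term)}


def apply_mapping(mu, t):
    """mu(t): replace every variable of t on which mu is defined."""
    return tuple(mu.get(term, term) if is_var(term) else term for term in t)


def compatible(mu1, mu2):
    """mu1 and mu2 agree on every variable in dom(mu1) ∩ dom(mu2)."""
    return all(mu1[x] == mu2[x] for x in mu1.keys() & mu2.keys())


t = ("?x", "rdfs:subPropertyOf", "?y")
mu1 = {"?x": "s:A"}
mu2 = {"?x": "s:A", "?y": "s:B"}
mu3 = {"?x": "s:C"}
```

Here dom(\(\mu \)) is simply the key set of the dictionary; `mu1` and `mu2` are compatible, whereas `mu1` and `mu3` are not, because they disagree on `?x`.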

Let \(\Omega _1\) and \(\Omega _2\) be multisets of mappings. The join and the union of \(\Omega _1\) and \(\Omega _2\) are defined as:

$$\begin{aligned} \Omega _1 \bowtie \Omega _2&= \{\mu _1 \cup \mu _2 ~|~ \mu _1 \in \Omega _1, ~\mu _2 \in \Omega _2,\\&~ ~ \mu _1 ~\hbox {and} ~\mu _2~ \hbox {are compatible mappings}\},\\ \Omega _1 \cup \Omega _2&= \{\mu ~| ~\mu \in \Omega _1 ~\hbox {or} ~\mu \in \Omega _2\}. \end{aligned}$$
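The two operations can be sketched in Python as follows. This is an illustrative reading, not part of the formal development: mappings are assumed to be dictionaries, multisets are assumed to be lists (so that duplicates are kept), and the names `join` and `union` are hypothetical.

```python
def compatible(mu1, mu2):
    """Two mappings are compatible if they agree on their shared variables."""
    return all(mu1[x] == mu2[x] for x in mu1.keys() & mu2.keys())


def join(omega1, omega2):
    """Merge every compatible pair of mappings, one from each multiset."""
    return [{**mu1, **mu2}
            for mu1 in omega1
            for mu2 in omega2
            if compatible(mu1, mu2)]


def union(omega1, omega2):
    """Collect the mappings of both multisets, keeping duplicates."""
    return list(omega1) + list(omega2)


omega1 = [{"?x": "a"}, {"?x": "b"}]
omega2 = [{"?x": "a", "?y": "c"}, {"?y": "d"}]
joined = join(omega1, omega2)
united = union(omega1, omega2)
```

In this example the pair formed by `{"?x": "b"}` and `{"?x": "a", "?y": "c"}` is discarded by the join because the two mappings disagree on `?x`; the other three pairs are merged.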

Given a mapping \(\mu \) and a built-in condition \(expr,\) we say that \(\mu \) satisfies \(expr,\) denoted by \(\mu \models expr,\) if:

  • \(expr\) is bound(?x) and ?x \(\in \) dom(\(\mu \));

  • \(expr\) is ?x = c, ?x \(\in \) dom(\(\mu \)) and \(\mu \)(?x) = c;

  • \(expr\) is ?x = ?y, ?x \(\in \) dom(\(\mu \)), ?y \(\in \) dom(\(\mu \)) and \(\mu \)(?x) = \(\mu \)(?y);

  • \(expr\) is (\(\lnot r\)), \(r\) is a built-in condition, and it is not the case that \(\mu \models r;\)

  • \(expr\) is (\(r_1 \vee r_2\)), \(r_1\) and \(r_2\) are built-in conditions, and \(\mu \models r_1\) or \(\mu \models r_2;\)

  • \(expr\) is (\(r_1 \wedge r_2\)), \(r_1\) and \(r_2\) are built-in conditions, \(\mu \models r_1\) and \(\mu \models r_2.\)

The FILTER operation on a multiset of mappings \(\Omega _1\) is then defined as:

$$\begin{aligned} \Omega _1 ~\hbox {FILTER} ~expr&= \{\mu \in \Omega _1 ~|~ \mu \models expr\}. \end{aligned}$$
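The satisfaction relation and the FILTER operation can be sketched as a small recursive function. The encoding of built-in conditions as nested tuples (`"bound"`, `"eq"`, `"not"`, `"or"`, `"and"`) and the names `satisfies` and `filter_mappings` are assumptions of this sketch; the logic follows the two-valued cases listed above.

```python
def is_var(term):
    """Variables are written with a leading '?', as in SPARQL."""
    return isinstance(term, str) and term.startswith("?")


def satisfies(mu, expr):
    """Return True if the mapping mu satisfies the built-in condition expr."""
    op = expr[0]
    if op == "bound":                      # bound(?x)
        return expr[1] in mu
    if op == "eq":                         # ?x = c  or  ?x = ?y
        left, right = expr[1], expr[2]
        if left not in mu:
            return False
        if is_var(right):
            return right in mu and mu[left] == mu[right]
        return mu[left] == right
    if op == "not":                        # (not r)
        return not satisfies(mu, expr[1])
    if op == "or":                         # (r1 or r2)
        return satisfies(mu, expr[1]) or satisfies(mu, expr[2])
    if op == "and":                        # (r1 and r2)
        return satisfies(mu, expr[1]) and satisfies(mu, expr[2])
    raise ValueError(f"unknown condition: {op!r}")


def filter_mappings(omega, expr):
    """Omega FILTER expr: keep the mappings that satisfy expr."""
    return [mu for mu in omega if satisfies(mu, expr)]


omega = [{"?x": "a", "?y": "a"}, {"?x": "a", "?y": "b"}, {"?x": "c"}]
same = filter_mappings(omega, ("eq", "?x", "?y"))
unbound = filter_mappings(omega, ("not", ("bound", "?y")))
```

With the sample multiset, the condition `?x = ?y` keeps only the first mapping, and the negated `bound(?y)` keeps only the last one.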

We are ready to define the semantics of graph pattern expressions. We will not consider the OPTIONAL clause, as we do not use OPTIONAL in our query translation.

Definition 9

(Semantics of SPARQL graph pattern expressions) The evaluation of a graph pattern \(P\) over an RDF dataset \(D,\) denoted by \(ans_R(P,D),\) is recursively defined as:

  • If \(P\) is a triple pattern \(t,\) then \(ans_R(P,D) = \, \{\mu ~|~\hbox {dom}(\mu ) = \, \hbox {var}(t)\) and \(\mu (t) \in D\}.\)

  • If \(P\) is (\(P_1\) AND \(P_2\)), then \(ans_R(P,D) = \, ans_R(P_1,D) \bowtie ans_R(P_2,D).\)

  • If \(P\) is (\(P_1\) UNION \(P_2\)), then \(ans_R(P,D)=\! ans_R(P_1,D) \cup ans_R(P_2,D).\)

  • If \(P\) is (\(P_1\) FILTER \(expr\)), then \(ans_R(P,D)= \, \{\mu \in ans_R(P_1, \, D)~|~ \, \mu \models expr\}.\)

For more details about the definition of the semantics of SPARQL graph pattern expressions, the reader is referred to [28].
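Definition 9 translates almost literally into a recursive evaluator. The Python sketch below is given for illustration only and rests on several assumptions: a dataset is a set of string triples, graph patterns are tagged tuples (`"triple"`, `"and"`, `"union"`, `"filter"`), and a FILTER condition is passed as a Python predicate on mappings standing for \(\mu \models expr\). The names `match` and `ans_r` and the sample IRIs are hypothetical.

```python
def match(t, triple):
    """Return the mapping mu with dom(mu) = var(t) and mu(t) = triple, or None."""
    mu = {}
    for term, value in zip(t, triple):
        if term.startswith("?"):
            # A repeated variable must be bound to the same value everywhere.
            if mu.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return mu


def ans_r(pattern, dataset):
    """Evaluate a graph pattern over a dataset, following Definition 9."""
    kind = pattern[0]
    if kind == "triple":
        found = (match(pattern[1], triple) for triple in dataset)
        return [mu for mu in found if mu is not None]
    if kind == "and":
        left = ans_r(pattern[1], dataset)
        right = ans_r(pattern[2], dataset)
        # Join: merge the compatible pairs of mappings.
        return [{**m1, **m2} for m1 in left for m2 in right
                if all(m1[x] == m2[x] for x in m1.keys() & m2.keys())]
    if kind == "union":
        return ans_r(pattern[1], dataset) + ans_r(pattern[2], dataset)
    if kind == "filter":
        return [mu for mu in ans_r(pattern[1], dataset) if pattern[2](mu)]
    raise ValueError(f"unknown pattern: {kind!r}")


# Toy dataset with made-up IRIs.
D = {
    ("s:p", "rdfs:domain", "s:c"),
    ("s:q", "rdfs:domain", "s:d"),
    ("s:p", "rdfs:subPropertyOf", "s:q"),
}
domain = ("triple", ("?p", "rdfs:domain", "?c"))
sub = ("triple", ("?p", "rdfs:subPropertyOf", "?q"))
both = ans_r(("and", domain, sub), D)
only_d = ans_r(("filter", domain, lambda mu: mu["?c"] == "s:d"), D)
```

Since the dataset is a Python set, the order of the returned mappings is not fixed; only their content is meaningful. On the toy dataset, the AND pattern yields the single mapping binding `?p` to `s:p`, `?c` to `s:c` and `?q` to `s:q`.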

Now, let us prove the commutativity of Fig. 5. As already remarked above, we can focus on the right-hand side of the diagram, showing that:

$$\begin{aligned} \Phi (ans(\alpha ,{\fancyscript{D}^\star }))=ans_R(\Psi (\alpha ), \Phi ({I^\star })) \end{aligned}$$

for every digital library \({\fancyscript{D}}\) and query \(\alpha \) in \({\fancyscript{Q}}.\) As already done for the commutativity of the left-hand side of the diagram, we will divide the proof into two parts: soundness (i.e. \(\Phi (ans(\alpha ,{\fancyscript{D}^\star })) \subseteq ans_R(\Psi (\alpha ), \Phi ({I^\star }))\)) and completeness (i.e. \(ans_R(\Psi (\alpha ), \Phi ({I^\star }))\subseteq \Phi (ans(\alpha ,{\fancyscript{D}^\star }))\)).

Lemma 4

(Soundness) For every \({\fancyscript{Q}}\) query \(\alpha \) and digital library \({\fancyscript{D}}, \, \Phi (ans(\alpha , {\fancyscript{D}^\star })) \subseteq ans_R(\Psi (\alpha ), \Phi (I^\star )).\)

Proof

We consider atomic queries first, \(P(t_1,\ldots ,t_n),\) where for generality every term \(t_i\) is a variable \(x_i.\) The cases in which some terms are identifiers are special cases of this one, and their proofs are omitted for brevity. The proof proceeds case by case, considering all predicate symbols in \({\fancyscript{Q}}.\) It is very similar for each symbol, so we show only two cases: a simple one and a more complicated one, which requires the introduction of a FILTER clause. The complete proof can be found in [34].

  1.

    Let \(\alpha ={\mathsf {SchCl}}(x_1,x_2).\) Suppose in \({\fancyscript{D}}^\star \) there are exactly \(n\ge 0\) tuples \((s_1,c_1),\ldots , \, (s_n,c_n)\) in \({I^\star }({\mathsf {SchCl}}).\) As a consequence, \(ans(\alpha , \, {\fancyscript{D}}^\star ) = \{\langle s_1,c_1\rangle ,\ldots ,\langle s_n,c_n\rangle \},\) hence: \(\Phi (ans(\alpha , {\fancyscript{D}}^\star )) = \{\langle \mathrm {?x}_1\!\rightarrow \! \underline{s_1}, \mathrm {?x}_2\rightarrow \underline{c_1}\rangle , \ldots , \, \langle \mathrm {?x}_1\rightarrow \underline{s_n}, \, \mathrm {?x}_2\rightarrow \underline{c_n}\rangle \}.\) As another consequence, by applying \(\Phi \) to \({\fancyscript{D}}^\star ,\) we have \(\{(\underline{s_1}\):\(\underline{c_1}\) rdf:type rdfs:Class\(),\ldots , \, (\underline{s_n}\):\(\underline{c_n}\) rdf:type rdfs:Class)\(\} \, \subseteq \Phi ( \, I^\star ).\) Now, \(\Psi (\alpha )\) is given by the following SPARQL query:

    figure n

    By the definition of the SPARQL answering function, we have \(\{\langle \mathrm {?x}_1\rightarrow \underline{s_1}, \mathrm {?x}_2\rightarrow \underline{c_1}\rangle , \ldots , \langle \mathrm {?x}_1\rightarrow \underline{s_n}, \mathrm {?x}_2\rightarrow \underline{c_n}\rangle \}\subseteq ans_R(\Psi (\alpha ), \, \Phi (I^\star )).\) Hence \(\Phi (ans(\alpha ,{\fancyscript{D}}^\star )) \subseteq \, ans_R(\Psi (\alpha ),\Phi (I^\star )).\)

  2.

    Let \(\alpha ={\mathsf {Dom}}(x_1,x_2,x_3)\). Suppose in \({\fancyscript{D}}^\star \) there are exactly \(n\ge 0\) tuples \((s_1,p_1,c_1),\ldots , \, (s_n,p_n,c_n)\) in \({I^\star }({\mathsf {Dom}}).\) As a consequence, \(ans\, (\alpha , \, {\fancyscript{D}}^\star ) = \{\langle {s_1,p_1,c_1}\rangle ,\ldots , \, \langle s_n, \, p_n,c_n\rangle \},\) hence \(\Phi (ans(\alpha , \, {\fancyscript{D}}^\star )) \, = \{\langle \mathrm {?x}_1 \rightarrow \underline{s_1}, \mathrm {?x}_2 \rightarrow \underline{p_1}, \mathrm {?x}_3 \rightarrow \underline{c_1}\rangle , \ldots ,\langle \mathrm {?x}_1 \rightarrow \underline{s_n}, \mathrm {?x}_2 \rightarrow \underline{p_n}, \mathrm {?x}_3 \rightarrow \underline{c_n}\rangle \}.\) As another consequence, by applying \(\Phi \) to \({\fancyscript{D}}^\star ,\) we have \(\{(\underline{s_1}\):\(\underline{p_1}\) rdfs:domain \(\underline{s_1}\):\(\underline{c_1}) \, ,\ldots , \, (\underline{s_n}\):\(\underline{p_n}\) rdfs:domain \(\underline{s_n}\):\(\underline{c_n})\} \, \subseteq \Phi (I^\star ).\) Now, \(\Psi (\alpha )\) is given by the following SPARQL query:

    figure o

    By the definition of the SPARQL answering function, we have \(\{\langle \mathrm {?x}_1 \rightarrow \underline{s_1}, \mathrm {?x}_2 \rightarrow \underline{p_1}, \mathrm {?x}_3 \rightarrow \underline{c_1} \rangle ,\ldots , \langle \mathrm {?x}_1 \rightarrow \underline{s_n}, \mathrm {?x}_2 \rightarrow \underline{p_n}, \mathrm {?x}_3 \rightarrow \underline{c_n} \rangle \} \subseteq ans_R(\Psi (\alpha ), \, \Phi (I^\star )).\) Hence \(\Phi (ans(\alpha , \, {\fancyscript{D}}^\star )) \subseteq ans_R(\Psi (\alpha ), \, \Phi (I^\star )).\)

The extension of the proof to the other types of queries is straightforward, given the correspondence between conjunction and disjunction in \({\fancyscript{Q}}\) with the “.” and the UNION operators in SPARQL, respectively, and the observation that in a SPARQL query all variables that do not appear in the SELECT clause are understood as existentially quantified. \(\square \)

Lemma 5

(Completeness) For every \({\fancyscript{Q}}\) query \(\alpha \) and digital library \({\fancyscript{D}}, \, ans_R(\Psi ( \alpha ),\Phi (I^\star ))\subseteq \Phi (ans(\alpha , {\fancyscript{D}^\star })).\)

Proof

As for the previous lemma, we give the completeness proof only for atomic queries having variables as terms, and show only two cases. The complete proof can be found in [34].

  1.

    Let \(\alpha ={\mathsf {SchCl}}(x_1,x_2)\) and \(\Phi (I^\star )\) be a given RDF dataset including exactly \(n\ge 0\) triples \((\underline{s_1}\):\(\underline{c_1}\) rdf:type rdfs:Class\(), \, \ldots , \, (\underline{s_n}\):\(\underline{c_n}\) rdf:type rdfs:Class). By definition, \(\Psi (\alpha )\) is given by:

    figure p

    By applying this query to \(\Phi (I^\star ),\) we have \(ans_R(\Psi (\alpha ), \, \Phi (I^\star )) = \{\langle \mathrm {?x_1}\rightarrow \underline{s_1}, \mathrm {?x_2}\rightarrow \underline{c_1}\rangle ,\ldots , \langle \mathrm {?x_1}\rightarrow \underline{s_n}, \mathrm {?x_2}\rightarrow \underline{c_n}\rangle \}.\) By inverse application of function \(\Phi \) to \(\Phi (I^\star ),\) we have \(\{(s_1,c_1),\ldots ,(s_n,c_n)\} \subseteq {I^\star }({\mathsf {SchCl}}).\) Applying \(\alpha ={\mathsf {SchCl}}(x_1,x_2)\) to \({\fancyscript{D}}^\star ,\) we have \(\{\langle s_1,c_1\rangle , \ldots ,\langle s_n,c_n\rangle \} \subseteq ans(\alpha ,{\fancyscript{D}^\star }).\) By the definition of \(\Phi , \{\langle \mathrm {?x_1}\rightarrow \underline{s_1}, \mathrm {?x_2} \, \rightarrow \underline{c_1}\rangle ,\ldots , \langle \mathrm {?x_1}\rightarrow \underline{s_n}, \mathrm {?x_2}\rightarrow \underline{c_n}\rangle \} \subseteq \Phi (ans(\alpha ,{\fancyscript{D}^\star })).\) Hence \(ans_R( \, \Psi ({\mathsf {SchCl}}(x_1,x_2)), \Phi (I^\star ))\subseteq \Phi (ans( {\mathsf {SchCl}}(x_1,x_2), \, {\fancyscript{D}^\star })).\)

  2.

    Let \(\alpha ={\mathsf {Dom}}(x_1,x_2,x_3)\) and \(\Phi (I^\star )\) be a given RDF dataset including exactly \(n\ge 0\) triples \((\underline{s_1}\):\(\underline{p_1}\) rdfs:domain \(\underline{s_1}\):\(\underline{c_1}), \, \ldots , \, (\underline{s_n}\):\(\underline{p_n}\) rdfs:domain \(\underline{s_n}\):\(\underline{c_n}).\) By definition, \(\Psi (\alpha )\) is given by:

    figure q

    By applying this query to \(\Phi (I^\star ),\) we have \(ans_R(\Psi (\alpha ), \Phi (I^\star ))=\{\langle \mathrm {?x}_1\rightarrow \underline{s_1}, \mathrm {?x}_2\rightarrow \underline{p_1}, \mathrm {?x}_3\rightarrow \underline{c_1}\rangle , \ldots , \langle \mathrm {?x}_1\rightarrow \underline{s_n}, \mathrm {?x}_2\rightarrow \underline{p_n}, \mathrm {?x}_3\rightarrow \underline{c_n}\rangle \}.\) By inverse application of \(\Phi \) to \(\Phi (I^\star ),\) we have \(\{(s_1,p_1,c_1), \, \ldots ,(s_n,p_n,c_n)\} \, \subseteq {I^\star }({\mathsf {Dom}}).\) Applying \(\alpha ={\mathsf {Dom}}(x_1,x_2,x_3)\) to \({\fancyscript{D}}^\star ,\) we have \(\{\langle s_1, p_1, c_1\rangle , \, \ldots , \langle s_n, p_n, c_n\rangle \}\subseteq ans(\alpha , \, {\fancyscript{D}}^\star ).\) By the definition of \(\Phi , \, \{\langle \mathrm {?x}_1\rightarrow \underline{s_1}, \, \mathrm {?x}_2 \, \rightarrow \underline{p_1}, \mathrm {?x}_3\rightarrow \underline{c_1}\rangle , \ldots , \langle \mathrm {?x}_1\rightarrow \underline{s_n}, \mathrm {?x}_2\rightarrow \underline{p_n}, \mathrm {?x}_3\rightarrow \underline{c_n}\rangle \}\subseteq \Phi (ans({\mathsf {Dom}}(x_1,x_2,x_3),{\fancyscript{D}}^\star )).\) Hence \(ans_R(\Psi ({\mathsf {Dom}}(x_1,x_2,x_3)), \, \Phi (I^\star )) \!\subseteq \! \Phi (ans({\mathsf {Dom}}(x_1,x_2, \, x_3), {\fancyscript{D}}^\star )).\) \(\square \)


Cite this article

Meghini, C., Spyratos, N., Sugibuchi, T. et al. A Model for Digital Libraries and its Translation to RDF. J Data Semant 3, 107–139 (2014). https://doi.org/10.1007/s13740-013-0029-x
