Tableaux-based optimization of schema mappings for data integration

Journal of Intelligent Information Systems

Abstract

The task of combining data residing at different sources to provide the user with a unified view is known as data integration. Schema mappings act as the glue between the global schema and the source schemas of a data integration system. Global-and-local-as-view (GLAV) is one of the approaches for specifying schema mappings. Tableaux are used for expressing queries and functional dependencies on a single database. We investigate a general technique for expressing a GLAV mapping by a tabular structure called mapping assertion tableaux (MAT). In a similar way, we also express the tuple generating dependency (tgd) and equality generating dependency (egd) constraints in tabular forms, called tabular tgd (TTGD) and tabular egd (TEGD), respectively. A set consisting of the MATs, TTGDs, and TEGDs is called schema mapping tableaux (SMT). We present algorithms that use the SMT as an operator on an instance of the source schema to produce an instance of the target schema. We show that the target instances computed by the SMT are ‘minimal’ and ‘most general’ in nature. We also define the notion of equivalence between the schema mappings of two data integration systems and present algorithms that optimize schema mappings through manipulation of the SMT.
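To make the idea of a mapping acting as an operator on a source instance more concrete, the following is a minimal sketch of how a single GLAV-style assertion (a source-to-target tgd) could be applied to a source instance. It is not the MAT or SMT algorithm of the article, whose details are not given in the abstract. The function names (`matches`, `apply_assertion`), the `?x` convention for variables, the `_N1`-style labeled nulls, and the `Emp`, `WorksIn`, and `Dept` relations are all hypothetical and chosen only for illustration.

```python
from itertools import count


def is_var(term):
    # Assumed convention: variables are strings that start with "?".
    return isinstance(term, str) and term.startswith("?")


def matches(atoms, instance, binding=None):
    """Yield every variable binding that maps all atoms into the instance."""
    binding = binding or {}
    if not atoms:
        yield dict(binding)
        return
    (rel, terms), rest = atoms[0], atoms[1:]
    # Sorting makes the enumeration order (and null numbering) deterministic.
    for tup in sorted(instance.get(rel, ())):
        if len(tup) != len(terms):
            continue
        new = dict(binding)
        ok = True
        for term, value in zip(terms, tup):
            if is_var(term):
                # Bind the variable, or check it against its earlier binding.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from matches(rest, instance, new)


def apply_assertion(body, head, source):
    """Apply one assertion body -> head to a source instance.

    Head variables that do not occur in the body are treated as
    existential and receive a fresh labeled null per body match.
    """
    body_vars = {t for _, terms in body for t in terms if is_var(t)}
    head_vars = {t for _, terms in head for t in terms if is_var(t)}
    frontier = sorted(body_vars & head_vars)
    existential = sorted(head_vars - body_vars)
    fresh = count(1)
    target, seen = {}, set()
    for b in matches(body, source):
        key = tuple(b[v] for v in frontier)
        if key in seen:  # same frontier values: nothing new to derive
            continue
        seen.add(key)
        env = {v: b[v] for v in frontier}
        for v in existential:
            env[v] = f"_N{next(fresh)}"
        for rel, terms in head:
            target.setdefault(rel, set()).add(tuple(env.get(t, t) for t in terms))
    return target


# Hypothetical assertion: Emp(n, d) -> exists m. WorksIn(n, d), Dept(d, m)
source = {"Emp": {("alice", "sales"), ("bob", "sales")}}
body = [("Emp", ("?n", "?d"))]
head = [("WorksIn", ("?n", "?d")), ("Dept", ("?d", "?m"))]
target = apply_assertion(body, head, source)
```

In this naive sketch each body match receives its own null, so the `sales` department ends up with two `Dept` tuples that differ only in their null. The sketch makes no attempt to remove such redundancy, and it does not handle egds or target tgds.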



Author information

Correspondence to Mehedi Masud.

About this article

Cite this article

Rahman, M.A., Masud, M., Kiringa, I. et al. Tableaux-based optimization of schema mappings for data integration. J Intell Inf Syst 38, 533–554 (2012). https://doi.org/10.1007/s10844-011-0166-3

  • DOI: https://doi.org/10.1007/s10844-011-0166-3
