MapMerge: correlating independent schema mappings

  • Special Issue Paper
The VLDB Journal

Abstract

One of the main steps toward integration or exchange of data is to design the mappings that describe the (often complex) relationships between the source schemas or formats and the desired target schema. In this paper, we introduce a new operator, called MapMerge, that can be used to correlate multiple, independently designed schema mappings of smaller scope into larger schema mappings. This allows a more modular construction of complex mappings from various types of smaller mappings such as schema correspondences produced by a schema matcher or pre-existing mappings that were designed by either a human user or via mapping tools. In particular, the new operator also enables a new “divide-and-merge” paradigm for mapping creation, where the design is divided (on purpose) into smaller components that are easier to create and understand and where MapMerge is used to automatically generate a meaningful overall mapping. We describe our MapMerge algorithm and demonstrate the feasibility of our implementation on several real and synthetic mapping scenarios. In our experiments, we make use of a novel similarity measure between two database instances with different schemas that quantifies the preservation of data associations. We show experimentally that MapMerge improves the quality of the schema mappings, by significantly increasing the similarity between the input source instance and the generated target instance. Finally, we provide a new algorithm that combines MapMerge with schema mapping composition to correlate flows of schema mappings.

Author information

Corresponding author

Correspondence to Bogdan Alexe.

About this article

Cite this article

Alexe, B., Hernández, M., Popa, L. et al. MapMerge: correlating independent schema mappings. The VLDB Journal 21, 191–211 (2012). https://doi.org/10.1007/s00778-012-0264-z

  • DOI: https://doi.org/10.1007/s00778-012-0264-z
