Schema mapping and query translation in heterogeneous P2P XML databases

The VLDB Journal

Abstract

Peers in a peer-to-peer data management system often have heterogeneous schemas and no mediated global schema. To translate queries across peers, we assume each peer provides correspondences between its schema and a small number of other peer schemas. We focus on query reformulation in the presence of heterogeneous XML schemas, including data–metadata conflicts. We develop an algorithm for inferring precise mapping rules from informal schema correspondences. We define the semantics of query answering in this setting and develop a query translation algorithm. Our translation handles an expressive fragment of XQuery and works both along and against the direction of mapping rules. We describe the HePToX heterogeneous P2P XML data management system, which incorporates our results. We report the results of extensive experiments on HePToX on both synthetic and real datasets. We demonstrate our system's utility and scalability on different P2P distributions.
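To make the notion of a data–metadata conflict concrete, the following is a minimal illustrative sketch and not the HePToX algorithm. It assumes two hypothetical peers: one stores an item's category as data (an attribute value), the other as metadata (an element name). The documents, the function names and the hand-written rewriting are all assumptions made for illustration; the paper infers such rewritings from schema correspondences and targets a fragment of XQuery rather than Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical peer A: the category is data (an attribute value).
PEER_A = """<store>
  <item category="book"><title>Dune</title></item>
  <item category="cd"><title>Kind of Blue</title></item>
</store>"""

# Hypothetical peer B: the category is metadata (an element name).
PEER_B = """<catalog>
  <book><name>Emma</name></book>
  <cd><name>Blue Train</name></cd>
</catalog>"""


def titles_at_a(root, category):
    """Query in peer A's vocabulary: titles of items in a given category."""
    return [item.findtext("title")
            for item in root.findall("item")
            if item.get("category") == category]


def titles_at_b(root, category):
    """The same query rewritten by hand for peer B.

    The value compared against in peer A becomes the tag name to select
    in peer B, and 'title' corresponds to 'name'.
    """
    return [elem.findtext("name") for elem in root.findall(category)]


a = ET.fromstring(PEER_A)
b = ET.fromstring(PEER_B)

# Answers gathered from both peers for one query posed at peer A.
answers = titles_at_a(a, "book") + titles_at_b(b, "book")
print(answers)
```

The point of the sketch is only that a selection on a value at one peer can turn into a selection on a tag name at another, which is why a purely structural renaming of paths is not enough to translate queries between such schemas.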

Author information

Corresponding author

Correspondence to Angela Bonifati.

About this article

Cite this article

Bonifati, A., Chang, E., Ho, T. et al. Schema mapping and query translation in heterogeneous P2P XML databases. The VLDB Journal 19, 231–256 (2010). https://doi.org/10.1007/s00778-009-0159-9

  • DOI: https://doi.org/10.1007/s00778-009-0159-9
