Abstract
To integrate information, data in different formats, from different, potentially overlapping sources, must be related and transformed to meet the users’ needs. Ten years ago, Clio introduced nonprocedural schema mappings to describe the relationship between data in heterogeneous schemas. This enabled powerful tools for mapping discovery and integration code generation, greatly simplifying the integration process. However, further progress is needed. We see an opportunity to raise the level of abstraction further, to encompass both data- and schema-centric integration tasks and to isolate applications from the details of how the integration is accomplished. Holistic information integration supports iteration across the various integration tasks, leveraging information about both schema and data to improve the integrated result. Integration independence allows applications to be independent of how, when, and where information integration takes place, making materialization and the timing of transformations an optimization decision that is transparent to applications. In this paper, we define these two important goals, and propose leveraging data mappings to create a framework that supports both data- and schema-level integration tasks.
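As an informal illustration of the distinction between schema-level and data-level mappings drawn above, the following minimal sketch applies both kinds of correspondence to a few records. It is not the framework proposed in the paper: the attribute names, the value table, and the `integrate` function are all hypothetical, chosen only to make the two levels concrete.

```python
from typing import Dict, List

# Hypothetical schema-level mapping: source attribute -> target attribute.
SCHEMA_MAPPING: Dict[str, str] = {"cust_name": "name", "cntry": "country"}

# Hypothetical data-level mapping: per target attribute, source value -> target value.
DATA_MAPPING: Dict[str, Dict[str, str]] = {
    "country": {"DE": "Germany", "CH": "Switzerland"},
}


def integrate(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Rename attributes per the schema mapping, then translate values per the data mapping."""
    result = []
    for row in rows:
        target = {}
        for src_attr, value in row.items():
            tgt_attr = SCHEMA_MAPPING.get(src_attr)
            if tgt_attr is None:
                continue  # attribute has no correspondence in the target schema
            # Values without a data-level correspondence pass through unchanged.
            target[tgt_attr] = DATA_MAPPING.get(tgt_attr, {}).get(value, value)
        result.append(target)
    return result


source_rows = [
    {"cust_name": "Ada", "cntry": "DE", "internal_id": "17"},
    {"cust_name": "Max", "cntry": "FR"},
]
integrated = integrate(source_rows)
```

In this toy setting the schema mapping decides which attributes reach the target and under what name, while the data mapping reconciles individual values. An application that consumes `integrated` does not need to know whether the result was computed eagerly or on demand, which is the flavor of independence the abstract describes.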
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Haas, L.M., Hentschel, M., Kossmann, D., Miller, R.J. (2009). Schema AND Data: A Holistic Approach to Mapping, Resolution and Fusion in Information Integration. In: Laender, A.H.F., Castano, S., Dayal, U., Casati, F., de Oliveira, J.P.M. (eds) Conceptual Modeling - ER 2009. ER 2009. Lecture Notes in Computer Science, vol 5829. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04840-1_5
DOI: https://doi.org/10.1007/978-3-642-04840-1_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04839-5
Online ISBN: 978-3-642-04840-1
eBook Packages: Computer Science (R0)