Abstract
Various XML instances from different data sources can model the same real-world object. Query processing or view definition over these sources requires instance integration. In this context, integration means identifying which data instances represent the same real-world object and resolving ambiguities in how that object is represented. The entity identification problem is more complex in XML than in structured databases: XML data, as originally conceived, do not necessarily carry an identification notion such as a primary key or object identifier. A mechanism is therefore needed to identify instances at the moment of data integration. This paper presents a proposal for assigning identifiers to XML instances, based on Skolem functions and the XPath recommendation of the W3C.
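The abstract does not spell out the mechanism, so the following is only a minimal sketch of the general idea, not the authors' method: a Skolem function is modeled as a deterministic term built from key values, and those key values are selected with path expressions. The sample documents, the element names, the `author_id` functor, and the helpers `skolem` and `assign_ids` are all hypothetical, and Python's `xml.etree.ElementTree` supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical sources: two documents with different schemas that
# describe the same real-world author.
SOURCE_A = """
<authors>
  <author><name>Ann Lee</name><email>Ann@Example.org</email></author>
</authors>
"""
SOURCE_B = """
<people>
  <person><fullname>A. Lee</fullname><mail>ann@example.org</mail></person>
  <person><fullname>Bob Roy</fullname><mail>bob@example.org</mail></person>
</people>
"""


def skolem(functor, *args):
    """Build a Skolem term: equal functor and arguments give an equal identifier."""
    return f"{functor}({', '.join(repr(a) for a in args)})"


def assign_ids(xml_text, instance_path, key_paths, functor="author_id"):
    """Map each Skolem identifier to the instances that produce it.

    instance_path selects the instances; key_paths select, relative to
    each instance, the values passed to the Skolem function.
    """
    root = ET.fromstring(xml_text)
    ids = {}
    for node in root.findall(instance_path):
        # Normalize key values so trivial differences do not split an entity.
        keys = [(node.findtext(p) or "").strip().lower() for p in key_paths]
        ids.setdefault(skolem(functor, *keys), []).append(node)
    return ids


ids_a = assign_ids(SOURCE_A, "./author", ["email"])
ids_b = assign_ids(SOURCE_B, "./person", ["mail"])

# Instances that received the same identifier are candidates for fusion.
shared = set(ids_a) & set(ids_b)
print(shared)
```

Because the identifier depends only on the functor and the selected key values, instances from either source that agree on the key receive the same identifier, regardless of the order in which the sources are processed. Choosing which paths act as keys remains a modeling decision for each source.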
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
de Brum Saccol, D., Alberto Heuser, C. (2003). Integration of XML Data. In: Bressan, S., Lee, M.L., Chaudhri, A.B., Yu, J.X., Lacroix, Z. (eds) Efficiency and Effectiveness of XML Tools and Techniques and Data Integration over the Web. DIWeb 2002, EEXTT 2002. Lecture Notes in Computer Science, vol 2590. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36556-7_5
DOI: https://doi.org/10.1007/3-540-36556-7_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00736-4
Online ISBN: 978-3-540-36556-3
eBook Packages: Springer Book Archive