
Abstract

Several XML instances from different data sources can model the same real-world object. Processing queries or defining views over these sources requires instance integration. In this context, integration means identifying which data instances represent the same real-world object and resolving ambiguities in how that object is represented. Entity identification is more complex in XML than in structured databases: XML data, as originally conceived, do not necessarily carry an identifying notion such as a primary key or object identifier. A mechanism that identifies instances at data integration time is therefore needed. This paper presents a proposal for assigning identifiers to XML instances, based on Skolem functions and the XPath recommendation proposed by the W3C.
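The idea of deriving identifiers from Skolem functions over XPath-selected values can be illustrated with a minimal sketch. It is not the authors' actual method, which this page does not show. The sample documents, the element names, the choice of the e-mail value as key, and the helpers `skolem` and `assign_ids` are all assumptions made for illustration. The paths use the limited XPath subset supported by Python's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: two sources describing the same real-world author
# with different element names.
SOURCE_A = ("<authors><author><name>Ana Silva</name>"
            "<email>ana@example.org</email></author></authors>")
SOURCE_B = ("<people><person><fullname>Ana Silva</fullname>"
            "<mail>ana@example.org</mail><city>Porto Alegre</city></person></people>")


def skolem(functor, *args):
    """Build a Skolem term: equal functor and arguments always yield the same identifier."""
    return f"{functor}({', '.join(repr(a) for a in args)})"


def assign_ids(xml_text, instance_path, key_paths, functor):
    """Map each Skolem identifier to the instances that produced it.

    instance_path selects the instances; key_paths select, relative to each
    instance, the values passed to the Skolem function (assumed identifying).
    """
    root = ET.fromstring(xml_text)
    ids = {}
    for inst in root.findall(instance_path):
        # Normalise key values so trivial spelling differences do not split identifiers.
        keys = tuple((inst.findtext(p) or "").strip().lower() for p in key_paths)
        ids.setdefault(skolem(functor, *keys), []).append(inst)
    return ids


ids_a = assign_ids(SOURCE_A, "./author", ["email"], "Author")
ids_b = assign_ids(SOURCE_B, "./person", ["mail"], "Author")

# Instances from different sources that received the same identifier are
# candidates for representing the same real-world object.
shared = set(ids_a) & set(ids_b)
print(shared)
```

Because a Skolem term is determined entirely by its functor and arguments, instances from independent sources receive matching identifiers without any coordination between the sources, provided the chosen key values really do identify the object.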

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

de Brum Saccol, D., Alberto Heuser, C. (2003). Integration of XML Data. In: Bressan, S., Lee, M.L., Chaudhri, A.B., Yu, J.X., Lacroix, Z. (eds) Efficiency and Effectiveness of XML Tools and Techniques and Data Integration over the Web. EEXTT 2002, DIWeb 2002. Lecture Notes in Computer Science, vol 2590. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36556-7_5

  • DOI: https://doi.org/10.1007/3-540-36556-7_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-00736-4

  • Online ISBN: 978-3-540-36556-3

  • eBook Packages: Springer Book Archive
