
Integrating XML Sources into a Data Warehouse

  • Conference paper
Data Engineering Issues in E-Commerce and Services (DEECS 2006)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4055)

Abstract

Since XML has become a standard for data exchange over the Internet, especially in B2B and B2C communication, there is an increasing need to integrate XML data into data warehousing systems. In this paper we propose a methodology for data warehouse design in which the data sources are XML Schemas and conforming XML documents. Particular attention is given to conceptual and logical multidimensional design. A prototype tool has been developed to verify and support our methodology. Because of the semi-structured nature of XML data, not all the information needed for design can be safely derived from the XML Schema. In these situations, the tool generates XQuery statements to examine the XML documents. The functionality of the tool is demonstrated on a real-life XML Schema that describes purchase orders.
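As an illustration of examining documents when the schema alone is inconclusive, the following minimal sketch checks whether a child element occurs at most once under every parent in the actual data, i.e. whether the relationship behaves as to-one. It is written in Python rather than XQuery, and it is not the tool's implementation: the abstract does not say which properties the generated queries test, so this particular check, the element names (`order`, `customer`, `item`), the sample data, and the helper functions are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element names are illustrative only
# and are not taken from the paper or from any real schema.
SAMPLE = """
<purchaseOrders>
  <order id="1"><customer>C1</customer><item>A</item><item>B</item></order>
  <order id="2"><customer>C2</customer><item>C</item></order>
</purchaseOrders>
""".strip()


def max_occurrences(root, parent_tag, child_tag):
    """Largest number of direct child_tag children under any parent_tag element."""
    counts = [len(parent.findall(child_tag)) for parent in root.iter(parent_tag)]
    return max(counts, default=0)


def is_to_one(root, parent_tag, child_tag):
    """True if no parent_tag element has more than one child_tag child."""
    return max_occurrences(root, parent_tag, child_tag) <= 1


root = ET.fromstring(SAMPLE)
print(is_to_one(root, "order", "customer"))  # True
print(is_to_one(root, "order", "item"))      # False
```

A check of this kind can only confirm that the documents at hand are consistent with a to-one relationship; it cannot prove that future documents will be.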

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Vrdoljak, B., Banek, M., Skočir, Z. (2006). Integrating XML Sources into a Data Warehouse. In: Lee, J., Shim, J., Lee, Sg., Bussler, C., Shim, S. (eds) Data Engineering Issues in E-Commerce and Services. DEECS 2006. Lecture Notes in Computer Science, vol 4055. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11780397_11

  • DOI: https://doi.org/10.1007/11780397_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-35440-6

  • Online ISBN: 978-3-540-35441-3

  • eBook Packages: Computer Science (R0)
