
Specifying OLAP Cubes on XML Data

Journal of Intelligent Information Systems

Abstract

On-Line Analytical Processing (OLAP) enables analysts to gain insight into data through fast, interactive access to a variety of views on information organized in a dimensional model. The demand for data integration is growing rapidly as more and more information sources appear in modern enterprises. In the data warehousing approach, selected information is extracted in advance and stored in a repository, yielding good query performance. However, in many situations a logical (rather than physical) integration of data is preferable. Previous web-based data integration efforts have focused almost exclusively on the logical level of data models, creating a need for techniques that work at the conceptual level. Also, previous integration techniques for web-based data have not addressed the special needs of OLAP tools, such as handling dimensions with hierarchies. Extensible Markup Language (XML) is fast becoming the new standard for data representation and exchange on the World Wide Web. The rapid emergence of XML data on the web, e.g., in business-to-business (B2B) e-commerce, is making it necessary for OLAP and other data analysis tools to handle XML data as well as traditional data formats.

Based on a real-world case study, this paper presents an approach to specifying OLAP databases based on web data. Unlike previous work, this approach takes special OLAP issues such as dimension hierarchies and correct aggregation of data into account. The approach also works on the conceptual level, using the Unified Modeling Language (UML) as a basis for so-called UML snowflake diagrams that precisely capture the multidimensional structure of the data. An integration architecture that allows the logical integration of XML and relational data sources for use by OLAP tools is also presented.
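
To make the notions of a dimension hierarchy and correct aggregation concrete, the following is a minimal sketch in Python. It is not the paper's method or architecture: the XML layout, the element and attribute names (`sale`, `product`, `category`, `amount`), and the helper functions are all assumptions invented for illustration. It reads facts from an XML fragment, rolls a measure up along a two-level product-to-category hierarchy, and checks one condition commonly required for aggregates to be correct, namely that every lower-level value has exactly one parent.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical XML source; element and attribute names are invented for illustration.
XML_SOURCE = """
<sales>
  <sale product="resistor" category="passive" amount="10"/>
  <sale product="capacitor" category="passive" amount="5"/>
  <sale product="transistor" category="active" amount="20"/>
  <sale product="resistor" category="passive" amount="7"/>
</sales>
"""


def load_facts(xml_text):
    """Parse each <sale> element into a fact with two dimension levels and one measure."""
    root = ET.fromstring(xml_text.strip())
    return [
        {
            "product": sale.get("product"),
            "category": sale.get("category"),
            "amount": int(sale.get("amount")),
        }
        for sale in root.iter("sale")
    ]


def is_strict(facts, child_level, parent_level):
    """Return True if every child value maps to exactly one parent value."""
    parents = defaultdict(set)
    for fact in facts:
        parents[fact[child_level]].add(fact[parent_level])
    return all(len(values) == 1 for values in parents.values())


def roll_up(facts, level):
    """Sum the 'amount' measure grouped by the chosen hierarchy level."""
    totals = defaultdict(int)
    for fact in facts:
        totals[fact[level]] += fact["amount"]
    return dict(totals)


facts = load_facts(XML_SOURCE)
by_product = roll_up(facts, "product")
by_category = roll_up(facts, "category")
```

In this toy data each product belongs to a single category, so `is_strict(facts, "product", "category")` holds and the category totals add up to the same grand total as the product totals. If a product appeared under two categories, the check would fail, signalling that sums at the category level could no longer be derived safely from product-level sums.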




Jensen, M.R., Møller, T.H. & Pedersen, T.B. Specifying OLAP Cubes on XML Data. Journal of Intelligent Information Systems 17, 255–280 (2001). https://doi.org/10.1023/A:1012814015209
