An OLAP-based Scalable Web Access Analysis Engine

  • Conference paper
Data Warehousing and Knowledge Discovery (DaWaK 2000)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1874)

Abstract

Collecting and mining web log records (WLRs) from e-commerce web sites has become increasingly important for targeted marketing, promotions, and traffic analysis. In this paper, we describe a scalable data warehousing and OLAP-based engine for analyzing WLRs. We have to address several scalability and performance challenges in developing such a framework. Because an active web site may generate hundreds of millions of WLRs daily, we have to deal with huge data volumes and data flow rates. To support fine-grained analysis, e.g., individual users’ access profiles, we end up with huge, sparse data cubes defined over very large dimensions (there may be hundreds of thousands of visitors to the site and tens of thousands of pages). While OLAP servers store sparse cubes quite efficiently, rolling up a very large cube can take prohibitively long. We have applied several non-traditional approaches to deal with this problem, which allow us to speed up WLR analysis by three orders of magnitude. Our framework supports multilevel and multidimensional pattern extraction, analysis, and feature ranking, and, in addition to the typical OLAP operations, supports data mining operations such as extended multilevel and multidimensional association rules.
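
To make the notions of a sparse cube and a roll-up concrete, the following is a minimal sketch in Python. It is not the engine described in the paper and does not reproduce its speed-up techniques. The record layout (visitor, page, hour), the sample data, and the helper names `build_cube`, `roll_up`, and `page_to_category` are all assumptions chosen for illustration. Only non-empty cells are stored, and a roll-up maps each dimension value to a coarser level or aggregates the dimension away.

```python
from collections import defaultdict

# Hypothetical web log records: (visitor, page, hour_of_day).
WLRS = [
    ("v1", "/shoes/boots", 9),
    ("v1", "/shoes/boots", 9),
    ("v1", "/shoes/sandals", 10),
    ("v2", "/books/fiction", 9),
    ("v2", "/shoes/boots", 21),
]


def build_cube(records):
    """Build a sparse cube: only cells with at least one hit are stored."""
    cube = defaultdict(int)
    for visitor, page, hour in records:
        cube[(visitor, page, hour)] += 1
    return cube


def page_to_category(page):
    """Assumed page hierarchy: the first path segment is the category."""
    return page.strip("/").split("/")[0]


def identity(value):
    return value


def roll_up(cube, mappers):
    """Aggregate a cube to a coarser level.

    mappers holds one entry per dimension: a function that maps a value
    to its parent level, or None to aggregate the dimension away ("ALL").
    """
    rolled = defaultdict(int)
    for cell, count in cube.items():
        key = tuple(
            "ALL" if mapper is None else mapper(value)
            for value, mapper in zip(cell, mappers)
        )
        rolled[key] += count
    return rolled


cube = build_cube(WLRS)
# Keep visitors, lift pages to categories, and drop the hour dimension.
by_visitor_category = roll_up(cube, (identity, page_to_category, None))
for key, hits in sorted(by_visitor_category.items()):
    print(key, hits)
```

Each roll-up in this sketch visits every stored cell once, so its cost grows with the number of non-empty cells. With hundreds of thousands of visitors and tens of thousands of pages, that per-cell cost is what makes naive roll-ups of a very large cube slow.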

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chen, Q., Dayal, U., Hsu, M. (2000). An OLAP-based Scalable Web Access Analysis Engine. In: Kambayashi, Y., Mohania, M., Tjoa, A.M. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2000. Lecture Notes in Computer Science, vol 1874. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44466-1_21

  • DOI: https://doi.org/10.1007/3-540-44466-1_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-67980-6

  • Online ISBN: 978-3-540-44466-4

  • eBook Packages: Springer Book Archive
