A Generalization-Based Approach to Clustering of Web Usage Sessions

  • Conference paper
Web Usage Analysis and User Profiling (WebKDD 1999)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1836)

Abstract

The clustering of Web usage sessions based on the access patterns is studied. Access patterns of Web users are extracted from Web server log files, and then organized into sessions which represent episodes of interaction between the Web users and the Web server. Using attribute-oriented induction, the sessions are then generalized according to a page hierarchy which organizes pages based on their contents. These generalized sessions are finally clustered using a hierarchical clustering method. Our experiments on a large real data set show that the approach is efficient and practical for Web mining applications.
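
The three stages described in the abstract (building sessions, generalizing them over a page hierarchy, and clustering the result) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the sample log records, the 30-minute idle timeout, the use of URL path prefixes as a stand-in for a content-based page hierarchy, and plain average-linkage agglomerative clustering with cosine distance as a stand-in for whichever hierarchical method the authors used.

```python
from collections import Counter
from math import sqrt

# Hypothetical log records: (user_id, timestamp_in_seconds, page_path).
LOG = [
    ("u1", 0, "/courses/cs101/syllabus.html"),
    ("u1", 60, "/courses/cs101/hw1.html"),
    ("u1", 5000, "/sports/football/schedule.html"),
    ("u2", 10, "/courses/cs202/notes.html"),
    ("u2", 90, "/courses/cs202/hw3.html"),
    ("u3", 20, "/sports/tennis/results.html"),
]

def build_sessions(log, timeout=1800):
    """Group each user's requests into sessions, splitting on idle gaps > timeout."""
    sessions = []
    last_seen = {}  # user -> (last timestamp, index of that user's open session)
    for user, ts, page in sorted(log, key=lambda r: (r[0], r[1])):
        prev = last_seen.get(user)
        if prev is None or ts - prev[0] > timeout:
            sessions.append([])          # start a new session
            idx = len(sessions) - 1
        else:
            idx = prev[1]                # continue the open session
        sessions[idx].append(page)
        last_seen[user] = (ts, idx)
    return sessions

def generalize(session, level=1):
    """Replace each page by its ancestor at the given hierarchy depth and count hits.

    Assumption: the URL path prefix approximates the page hierarchy.
    """
    counts = Counter()
    for page in session:
        parts = page.strip("/").split("/")
        counts["/" + "/".join(parts[:level])] += 1
    return counts

def cosine_distance(a, b):
    """1 - cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return 1.0 - dot / norm

def agglomerate(vectors, threshold=0.5):
    """Average-linkage agglomerative clustering over session indices.

    Merging stops once the closest pair of clusters is farther apart than threshold.
    """
    clusters = [[i] for i in range(len(vectors))]

    def linkage(c1, c2):
        total = sum(cosine_distance(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        d, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > threshold:
            break
        merged = clusters.pop(j)   # j > i, so index i is unaffected
        clusters[i].extend(merged)
    return clusters

sessions = build_sessions(LOG)
vectors = [generalize(s, level=1) for s in sessions]
clusters = agglomerate(vectors)
print(clusters)
```

On this toy input, generalizing to the top level of the hierarchy maps the two course sessions to the same concept, so they fall into one cluster. At `level=2` they map to different course directories and share no dimension, so they are no longer merged; this is the kind of sparsity that generalization is meant to reduce.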

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fu, Y., Sandhu, K., Shih, MY. (2000). A Generalization-Based Approach to Clustering of Web Usage Sessions. In: Masand, B., Spiliopoulou, M. (eds) Web Usage Analysis and User Profiling. WebKDD 1999. Lecture Notes in Computer Science, vol 1836. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44934-5_2

  • DOI: https://doi.org/10.1007/3-540-44934-5_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-67818-2

  • Online ISBN: 978-3-540-44934-8

  • eBook Packages: Springer Book Archive
