Detail and Context in Web Usage Mining: Coarsening and Visualizing Sequences

  • Conference paper
WEBKDD 2001 — Mining Web Log Data Across All Customers Touch Points (WebKDD 2001)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2356)

Abstract

As Web sites begin to realize the advantages of engaging users in more extended interactions involving information and communication, the log files recording Web usage become more complex. While Web usage mining provides for the syntactic specification of structured patterns like association rules or (generalized) sequences, it is less clear how to analyze and visualize usage data involving longer patterns with little expected structure without losing an overview of the full set of paths. In this paper, concept hierarchies are used as a basic method of aggregating Web pages. Interval-based coarsening is then proposed as a method for representing sequences at different levels of abstraction. The tool STRATDYN, which implements these methods, uses χ² testing and coarsened stratograms. Stratograms with uniform or differential coarsening provide various detail-and-context views of actual and intended Web usage. Relations to the measures of support and confidence, as well as ways of analyzing generalized sequences, are shown. A case study of agent-supported shopping in an E-commerce site illustrates the formalism.
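
The two aggregation ideas named in the abstract can be illustrated with a minimal sketch. This is not STRATDYN's implementation and does not reproduce the paper's formal definitions: the page names, the flat page-to-concept mapping `CONCEPTS`, and the rule that step `t` falls into interval `t // g` are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical one-level concept hierarchy: page URL -> concept.
# All names here are illustrative and do not come from the paper.
CONCEPTS = {
    "/home": "entry",
    "/q1": "question",
    "/q2": "question",
    "/top10": "recommendation",
    "/product/17": "product",
    "/product/42": "product",
}


def abstract_session(pages):
    """Replace each visited page by its concept."""
    return [CONCEPTS[p] for p in pages]


def coarsen(sessions, g):
    """Count concept-to-concept transitions per interval of g steps.

    Step t (0-based) is the transition from position t to position t + 1
    of a session. Assumption: step t is assigned to interval t // g.
    """
    counts = Counter()
    for pages in sessions:
        concepts = abstract_session(pages)
        for t, (src, dst) in enumerate(zip(concepts, concepts[1:])):
            counts[(t // g, src, dst)] += 1
    return counts


sessions = [
    ["/home", "/q1", "/q2", "/top10", "/product/17"],
    ["/home", "/q1", "/top10", "/product/42", "/product/17"],
]

fine = coarsen(sessions, 1)    # one interval per step: full detail
coarse = coarsen(sessions, 2)  # two steps per interval: more context
```

With `g = 1` every step keeps its own interval; with `g = 2` neighbouring steps are merged, so the two sessions, which reach the recommendation page at different steps, contribute to the same `("recommendation", "product")` count in interval 1. Using a different `g` for different parts of the sequence would correspond, under the same assumptions, to differential rather than uniform coarsening.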

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Berendt, B. (2002). Detail and Context in Web Usage Mining: Coarsening and Visualizing Sequences. In: Kohavi, R., Masand, B.M., Spiliopoulou, M., Srivastava, J. (eds) WEBKDD 2001 — Mining Web Log Data Across All Customers Touch Points. WebKDD 2001. Lecture Notes in Computer Science, vol 2356. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45640-6_1

  • DOI: https://doi.org/10.1007/3-540-45640-6_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43969-1

  • Online ISBN: 978-3-540-45640-7

  • eBook Packages: Springer Book Archive