Abstract
As Web sites begin to realize the advantages of engaging users in more extended interactions involving information and communication, the log files recording Web usage become more complex. While Web usage mining provides for the syntactic specification of structured patterns like association rules or (generalized) sequences, it is less clear how to analyze and visualize usage data involving longer patterns with little expected structure without losing an overview of the full set of paths. In this paper, concept hierarchies are used as a basic method of aggregating Web pages. Interval-based coarsening is then proposed as a method for representing sequences at different levels of abstraction. The tool STRATDYN, which implements these methods, uses χ² testing and coarsened stratograms. Stratograms with uniform or differential coarsening provide various detail-and-context views of actual and intended Web usage. Relations to the measures support and confidence, and ways of analyzing generalized sequences, are shown. A case study of agent-supported shopping in an E-commerce site illustrates the formalism.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Berendt, B. (2002). Detail and Context in Web Usage Mining: Coarsening and Visualizing Sequences. In: Kohavi, R., Masand, B.M., Spiliopoulou, M., Srivastava, J. (eds) WEBKDD 2001 — Mining Web Log Data Across All Customers Touch Points. WebKDD 2001. Lecture Notes in Computer Science, vol 2356. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45640-6_1
DOI: https://doi.org/10.1007/3-540-45640-6_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43969-1
Online ISBN: 978-3-540-45640-7
eBook Packages: Springer Book Archive