Abstract
When analyzing patterns in server-side data, it quickly becomes apparent that some of the data originating from the client is lost, mainly because web pages are cached. Missing data is an important issue when using server-side data to analyze a user’s browsing behavior, since the quality of the browsing patterns that can be identified depends on the quality of the data. In this paper, we present a series of experiments that demonstrate the extent of the data loss in different browsing environments and illustrate the difference this makes to the resulting browsing patterns when visualized as footstep graphs. We propose an algorithm, called the Pattern Restore Method (PRM), for restoring some of the data that has been lost, and evaluate the efficiency and accuracy of this algorithm.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ting, IH., Kimble, C., Kudenko, D. (2005). A Pattern Restore Method for Restoring Missing Patterns in Server Side Clickstream Data. In: Zhang, Y., Tanaka, K., Yu, J.X., Wang, S., Li, M. (eds) Web Technologies Research and Development - APWeb 2005. APWeb 2005. Lecture Notes in Computer Science, vol 3399. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31849-1_49
DOI: https://doi.org/10.1007/978-3-540-31849-1_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25207-8
Online ISBN: 978-3-540-31849-1
eBook Packages: Computer Science (R0)