Using incremental Web log mining to create adaptive web servers

  • Regular contribution
International Journal on Digital Libraries

Abstract

Personalizing the content returned from a Web site is an important problem in general and affects e-commerce and e-services in particular. Targeting appropriate information or products to the end user can significantly improve the user experience on a Web site. One possible approach to Web personalization is to mine typical user profiles from the vast amount of historical data stored in access logs. We present a system that mines these logs to obtain profiles and uses them to automatically generate a Web page containing URLs the user might be interested in. The profiles are based only on the user’s prior traversal patterns on the Web site; the user does not have to provide any declarative information or log in. Profiles are dynamic in nature: a user’s traversal pattern changes over time. To reflect these changes in the personalized page generated for the user, the profiles have to be regenerated, taking the existing profile into account. Instead of creating a new profile from scratch, we incrementally add information to and/or remove information from the user profile, aiming to save both time and physical memory.
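
The idea of updating a profile incrementally, rather than rebuilding it from the full log, can be illustrated with a minimal Python sketch. This is not the authors' method, which the abstract does not describe in detail. It assumes, purely for illustration, that a profile is a table of URL visit counts and that sessions are lists of URLs. The names `UrlProfile`, `add_session`, `remove_session`, and `top_urls` are hypothetical.

```python
from collections import Counter


class UrlProfile:
    """Hypothetical per-user profile: URL visit counts (an assumption)."""

    def __init__(self):
        self.counts = Counter()

    def add_session(self, urls):
        # Fold a newly logged session into the existing profile.
        self.counts.update(urls)

    def remove_session(self, urls):
        # Retire an outdated session from the profile.
        self.counts.subtract(urls)
        # Unary plus drops URLs whose count is no longer positive.
        self.counts = +self.counts

    def top_urls(self, k):
        # Candidate URLs for the generated page, most visited first.
        return [url for url, _ in self.counts.most_common(k)]


# Usage: two sessions arrive, then the older one is retired.
old_session = ["/a", "/b", "/a"]
new_session = ["/b", "/c", "/b"]

profile = UrlProfile()
profile.add_session(old_session)
profile.add_session(new_session)
before = profile.top_urls(2)

profile.remove_session(old_session)
after = profile.top_urls(2)
```

Each update touches only the URLs in the session being added or removed, so its cost does not grow with the size of the historical log. That is the kind of time and memory saving the abstract aims for, although the real system may represent profiles quite differently.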

Author information

Corresponding author

Correspondence to Tapan Kamdar.

About this article

Cite this article

Kamdar, T., Joshi, A. Using incremental Web log mining to create adaptive web servers. Int J Digit Libr 5, 133–150 (2005). https://doi.org/10.1007/s00799-003-0057-5

  • DOI: https://doi.org/10.1007/s00799-003-0057-5
