Abstract
Personalization of content returned from a Web site is an important problem in general and affects e-commerce and e-services in particular. Targeting appropriate information or products to the end user can significantly improve the user experience on a Web site. One possible approach to Web personalization is to mine typical user profiles from the vast amount of historical data stored in access logs. We present a system that mines the logs to obtain profiles and uses them to automatically generate a Web page containing URLs the user might be interested in. The profiles are based only on the user’s prior traversal patterns on the Web site; the user does not have to provide any declarative information or log in. Profiles are dynamic in nature: a user’s traversal pattern changes with time. To reflect these changes in the personalized page generated for the user, the profiles have to be regenerated, taking into account the existing profile. Instead of creating a new profile from scratch, we incrementally add and/or remove information from a user profile, aiming to reduce both time and physical memory requirements.
Cite this article
Kamdar, T., Joshi, A. Using incremental Web log mining to create adaptive web servers. Int J Digit Libr 5, 133–150 (2005). https://doi.org/10.1007/s00799-003-0057-5
DOI: https://doi.org/10.1007/s00799-003-0057-5