Using incremental Web log mining to create adaptive web servers

Kamdar, Tapan; Joshi, Anupam

doi:10.1007/s00799-003-0057-5

Regular contribution
Published: 01 April 2005

Volume 5, pages 133–150, (2005)

International Journal on Digital Libraries

Tapan Kamdar¹ & Anupam Joshi¹

Abstract

Personalization of content returned from a Web site is an important problem in general and affects e-commerce and e-services in particular. Targeting appropriate information or products to the end user can significantly change (for the better) the user experience on a Web site. One possible approach to Web personalization is to mine typical user profiles from the vast amount of historical data stored in access logs. We present a system that mines the logs to obtain profiles and uses them to automatically generate a Web page containing URLs the user might be interested in. Profiles generated are only based on the prior traversal patterns of the user on the Web site and do not involve providing any declarative information or require the user to log in. Profiles are dynamic in nature. With time, a user’s traversal pattern changes. To reflect changes to the personalized page generated for the user, the profiles have to be regenerated, taking into account the existing profile. Instead of creating a new profile, we incrementally add and/or remove information from a user profile, aiming to save time as well as physical memory requirements.
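
The incremental update described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes that a profile is simply a table of per-URL visit counts, that log data arrives one session at a time, and that sessions older than a fixed window are dropped from the profile. The names `IncrementalProfile`, `add_session`, `window`, and `top_urls` are hypothetical.

```python
from collections import Counter, deque


class IncrementalProfile:
    """Per-URL visit counts over the most recent `window` sessions (assumed model)."""

    def __init__(self, window=3):
        self.window = window
        self.sessions = deque()  # recent sessions, oldest first
        self.counts = Counter()  # URL -> visits inside the window

    def add_session(self, urls):
        # Add the new session's URLs to the existing counts
        # instead of recomputing the profile from the full log.
        session = Counter(urls)
        self.sessions.append(session)
        self.counts.update(session)

        # Remove the contribution of sessions that left the window.
        while len(self.sessions) > self.window:
            expired = self.sessions.popleft()
            for url, n in expired.items():
                self.counts[url] -= n
                if self.counts[url] <= 0:
                    del self.counts[url]

    def top_urls(self, n=3):
        # URLs to list on the generated page, most visited first.
        return [url for url, _ in self.counts.most_common(n)]


profile = IncrementalProfile(window=2)
profile.add_session(["/a", "/b", "/a"])
profile.add_session(["/b", "/c"])
profile.add_session(["/c", "/d"])  # the first session now expires
print(profile.top_urls(1))
```

Each update touches only the URLs of the incoming and expiring sessions, which is the time and memory saving the abstract aims for. A real system would also need to identify users and sessions from raw log lines, which this sketch leaves out.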

Author information

Authors and Affiliations

Department of Computer Science and Electrical Engineering, University of Maryland Baltimore County, Baltimore, MD, 21250, USA
Tapan Kamdar & Anupam Joshi

Corresponding author

Correspondence to Tapan Kamdar.

About this article

Cite this article

Kamdar, T., Joshi, A. Using incremental Web log mining to create adaptive web servers. Int J Digit Libr 5, 133–150 (2005). https://doi.org/10.1007/s00799-003-0057-5

Published: 01 April 2005
Issue Date: April 2005
DOI: https://doi.org/10.1007/s00799-003-0057-5
