Discovery and Evaluation of Aggregate Usage Profiles for Web Personalization

Data Mining and Knowledge Discovery

Abstract

Web usage mining, possibly used in conjunction with standard approaches to personalization such as collaborative filtering, can help address some of the shortcomings of these techniques, including reliance on subjective user ratings, lack of scalability, and poor performance in the face of high-dimensional and sparse data. However, discovering patterns from usage data is not by itself sufficient for performing personalization tasks. The critical step is the effective derivation of good-quality, useful (i.e., actionable) “aggregate usage profiles” from these patterns. In this paper we present and experimentally evaluate two techniques, based on clustering of user transactions and clustering of pageviews, for discovering overlapping aggregate profiles that recommender systems can use effectively for real-time Web personalization. We evaluate these techniques both in terms of the quality of the individual profiles generated and in the context of providing recommendations as an integrated part of a personalization engine. In particular, our results indicate that, using the generated aggregate profiles, we can achieve effective personalization at early stages of users' visits to a site, based only on anonymous clickstream data and without explicit input from these users or deeper knowledge about them.
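As a rough illustration of what deriving an aggregate usage profile from a cluster of user transactions might look like, the sketch below assumes that each transaction is a set of pageview identifiers, that the clusters have already been formed by some clustering step, and that a profile keeps only the pageviews whose share of the cluster's transactions meets a threshold. The function name `aggregate_profile`, the parameter `min_weight`, and the toy URLs are hypothetical; the abstract does not specify these details, so this should be read as one plausible reading rather than the method evaluated in the paper.

```python
from collections import Counter


def aggregate_profile(cluster, min_weight=0.5):
    """Summarize a cluster of transactions as {pageview: weight}.

    Assumption (not from the abstract): a pageview's weight is the
    fraction of transactions in the cluster that contain it, and
    pageviews below min_weight are dropped from the profile.
    """
    if not cluster:
        return {}
    # Count each pageview at most once per transaction.
    counts = Counter(page for transaction in cluster for page in set(transaction))
    size = len(cluster)
    return {
        page: count / size
        for page, count in counts.items()
        if count / size >= min_weight
    }


# Hypothetical cluster of four anonymous transactions.
cluster = [
    {"/home", "/products", "/cart"},
    {"/home", "/products"},
    {"/home", "/about"},
    {"/products", "/cart"},
]
profile = aggregate_profile(cluster, min_weight=0.5)
```

Because each cluster is summarized independently, the same pageview can appear in several profiles, which is one simple way the resulting profiles could overlap.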

Cite this article

Mobasher, B., Dai, H., Luo, T. et al. Discovery and Evaluation of Aggregate Usage Profiles for Web Personalization. Data Mining and Knowledge Discovery 6, 61–82 (2002). https://doi.org/10.1023/A:1013232803866

  • DOI: https://doi.org/10.1023/A:1013232803866