Abstract
Web transaction data between Web visitors and Web functionalities usually convey task-oriented user behavior patterns. Mining this kind of clickstream data captures usage pattern information. Web usage mining has become one of the most widely used methods for Web recommendation, which customizes Web content to the user's preferred style. Traditional Web usage mining techniques, such as Web user session or Web page clustering, association rule mining, and frequent navigational path mining, can only discover usage patterns explicitly. They cannot, however, reveal the underlying navigational activities or identify the latent relationships associated with the patterns among Web users and Web pages. In this work, we propose a Web recommendation framework that incorporates a Web usage mining technique based on the Probabilistic Latent Semantic Analysis (PLSA) model. The main advantage of this method is that it not only discovers usage-based access patterns but also reveals the underlying latent factors. With the discovered user access patterns, we then present users with content of greater interest via collaborative recommendation. To validate the effectiveness of the proposed approach, we conduct experiments on real-world datasets and compare it with some existing traditional techniques. The preliminary experimental results demonstrate the usability of the proposed approach.
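The idea summarized in the abstract can be illustrated with a minimal sketch of standard PLSA fitted by EM on a session-by-page count matrix, followed by a simple scoring of unvisited pages. This is not the authors' implementation: the function names `fit_plsa` and `recommend`, the toy matrix, the choice of two latent factors, and the scoring rule are all assumptions made for illustration, and the recommendation step in the paper may differ.

```python
import numpy as np

def fit_plsa(counts, n_factors, n_iter=50, seed=0):
    """Fit P(z), P(s|z), P(p|z) to a session-by-page count matrix with EM.

    Hypothetical helper: a textbook PLSA sketch, not the paper's code.
    """
    rng = np.random.default_rng(seed)
    n_sessions, n_pages = counts.shape
    # Random, normalized starting points for the conditional distributions.
    p_z = np.full(n_factors, 1.0 / n_factors)
    p_s_z = rng.random((n_factors, n_sessions))
    p_s_z /= p_s_z.sum(axis=1, keepdims=True)
    p_p_z = rng.random((n_factors, n_pages))
    p_p_z /= p_p_z.sum(axis=1, keepdims=True)
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        # E-step: posterior P(z | s, p), shape (factors, sessions, pages).
        joint = p_z[:, None, None] * p_s_z[:, :, None] * p_p_z[:, None, :]
        post = joint / (joint.sum(axis=0, keepdims=True) + eps)
        # M-step: re-estimate from expected counts n(s, p) * P(z | s, p).
        expected = counts[None, :, :] * post
        mass_s = expected.sum(axis=2)   # (factors, sessions)
        mass_p = expected.sum(axis=1)   # (factors, pages)
        totals = mass_s.sum(axis=1)     # expected mass per factor
        p_s_z = mass_s / (totals[:, None] + eps)
        p_p_z = mass_p / (totals[:, None] + eps)
        p_z = totals / (totals.sum() + eps)
    return p_z, p_s_z, p_p_z

def recommend(counts, model, session, top_n=2):
    """Rank pages the session has not visited by a score proportional to P(p | s)."""
    p_z, p_s_z, p_p_z = model
    weights = p_z * p_s_z[:, session]   # proportional to P(z | s)
    scores = weights @ p_p_z            # proportional to P(p | s)
    scores = np.where(counts[session] > 0, -np.inf, scores)  # skip visited pages
    order = np.argsort(-scores)
    return [int(p) for p in order[:top_n] if np.isfinite(scores[p])]

# Toy data (assumed): rows are sessions, columns are pages, values are visit counts.
counts = np.array([
    [3, 2, 0, 0],
    [2, 3, 1, 0],
    [0, 0, 2, 3],
    [0, 1, 3, 2],
], dtype=float)

model = fit_plsa(counts, n_factors=2)
recs = recommend(counts, model, session=0)
print(recs)
```

Each row of `p_p_z` can be read as a latent usage pattern over pages, and `p_s_z` shows which sessions load on it. Because EM starts from a random initialization, the fitted factors (and the ranking order) can vary with `seed` and `n_iter`.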
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Xu, G., Zhang, Y., Zhou, X. (2005). A Web Recommendation Technique Based on Probabilistic Latent Semantic Analysis. In: Ngu, A.H.H., Kitsuregawa, M., Neuhold, E.J., Chung, JY., Sheng, Q.Z. (eds) Web Information Systems Engineering – WISE 2005. WISE 2005. Lecture Notes in Computer Science, vol 3806. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11581062_2
DOI: https://doi.org/10.1007/11581062_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30017-5
Online ISBN: 978-3-540-32286-3
eBook Packages: Computer Science, Computer Science (R0)