Abstract
Sequence pattern mining is one of the most important methods for mining WWW access logs. The Apriori algorithm is well known as a typical algorithm for sequence pattern mining. However, it has inherent difficulty in finding long sequential patterns and in extracting the interesting patterns from a huge number of results.
This article proposes a new method for finding generalized sequence patterns by matrix clustering. The method decomposes each sequence into a set of sequence elements, each of which corresponds to an ordered pair of items. Matrix clustering is then applied to extract a cluster of similar sequences, and the resulting sequence elements are composed into a generalized sequence.
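The three steps named in the abstract (decompose, cluster, compose) can be illustrated with a small Python sketch. It is not the authors' algorithm: the abstract does not say how ordered pairs are formed or how matrix clustering is computed, so the sketch assumes pairs of adjacent pages and substitutes a crude iterative row/column filter for matrix clustering. The names `decompose`, `dense_cluster`, `compose`, and `min_ratio`, and the sample sessions, are all hypothetical.

```python
from collections import defaultdict


def decompose(sequence):
    """Split a page sequence into sequence elements.

    Assumption: an element is an ordered pair of *adjacent* pages;
    immediate repeats (page reloads) are ignored.
    """
    return {(a, b) for a, b in zip(sequence, sequence[1:]) if a != b}


def dense_cluster(sessions, min_ratio=0.5, max_rounds=20):
    """Crude stand-in for matrix clustering (not the paper's method).

    Treats sessions as rows and sequence elements as columns of a
    binary matrix, then repeatedly drops sparse columns and rows
    until the remaining submatrix stops changing.
    """
    rows = {sid: decompose(seq) for sid, seq in sessions.items()}
    cols = set().union(*rows.values())
    active = set(rows)
    for _ in range(max_rounds):
        # Keep elements that occur in at least min_ratio of the active sessions.
        new_cols = {
            c for c in cols
            if sum(c in rows[s] for s in active) >= min_ratio * len(active)
        }
        # Keep sessions that contain at least min_ratio of the kept elements.
        new_active = {
            s for s in active
            if new_cols and len(rows[s] & new_cols) >= min_ratio * len(new_cols)
        }
        if new_cols == cols and new_active == active:
            break
        cols, active = new_cols, new_active
        if not active:
            break
    return active, cols


def compose(elements):
    """Compose sequence elements into a directed graph (adjacency lists)."""
    graph = defaultdict(set)
    for a, b in elements:
        graph[a].add(b)
    return {page: sorted(targets) for page, targets in graph.items()}


# Hypothetical sessions, invented for illustration only.
sessions = {
    "s1": ["home", "list", "item", "cart"],
    "s2": ["home", "list", "item", "cart", "pay"],
    "s3": ["home", "list", "item"],
    "s4": ["help", "faq"],
}

cluster_sessions, cluster_elements = dense_cluster(sessions)
generalized = compose(cluster_elements)
```

With these sample sessions the filter keeps `s1`, `s2`, and `s3`, and the composed graph is the chain `home` to `list` to `item` to `cart`. Working on pairs rather than whole sequences is what lets sessions of different lengths fall into the same cluster.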
Our method is evaluated on a practical WWW access log; the results show that it is useful for finding long sequences and for presenting the generalized sequence as a graph.
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Oyanagi, S., Kubota, K., Nakase, A. (2003). Mining WWW Access Sequence by Matrix Clustering. In: Zaïane, O.R., Srivastava, J., Spiliopoulou, M., Masand, B. (eds) WEBKDD 2002 - Mining Web Data for Discovering Usage Patterns and Profiles. WebKDD 2002. Lecture Notes in Computer Science, vol 2703. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39663-5_8
DOI: https://doi.org/10.1007/978-3-540-39663-5_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20304-9
Online ISBN: 978-3-540-39663-5
eBook Packages: Springer Book Archive