Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2703)

Abstract

Sequence pattern mining is one of the most important methods for mining WWW access logs. The Apriori algorithm is a well-known, typical algorithm for sequence pattern mining. However, it has inherent difficulty both in finding long sequential patterns and in extracting interesting patterns from a huge number of results.

This article proposes a new method for finding generalized sequence patterns by matrix clustering. The method decomposes each sequence into a set of sequence elements, each of which corresponds to an ordered pair of items. Matrix clustering is then applied to extract a cluster of similar sequences. The resulting sequence elements are composed into a generalized sequence.
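The decomposition and composition steps can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which ordered pairs are used, so the sketch assumes pairs of consecutive pages, and the function names `decompose`, `build_matrix`, and `compose` are hypothetical. The matrix clustering step itself is not specified in the abstract and is left out; the sketch only builds the binary session-by-element matrix that such a step would presumably take as input.

```python
from typing import Dict, List, Set, Tuple

Element = Tuple[str, str]  # one sequence element: an ordered pair of items


def decompose(sequence: List[str]) -> Set[Element]:
    """Split one access sequence into ordered pairs.

    Assumption: only consecutive pages are paired.
    """
    return set(zip(sequence, sequence[1:]))


def build_matrix(sessions: Dict[str, List[str]]):
    """Build a binary session x element matrix (hypothetical clustering input)."""
    decomposed = {sid: decompose(seq) for sid, seq in sessions.items()}
    elements = sorted(set().union(*decomposed.values()))
    rows = {
        sid: [1 if e in elems else 0 for e in elements]
        for sid, elems in decomposed.items()
    }
    return elements, rows


def compose(elements: Set[Element]) -> Dict[str, List[str]]:
    """Compose sequence elements into a graph stored as an adjacency list."""
    graph: Dict[str, List[str]] = {}
    for src, dst in sorted(elements):
        graph.setdefault(src, []).append(dst)
    return graph


sessions = {"s1": ["A", "B", "C"], "s2": ["A", "B", "D"]}
elements, rows = build_matrix(sessions)
graph = compose(set(elements))
```

For the two toy sessions above, `elements` holds the three pairs (A, B), (B, C), and (B, D), and `graph` maps A to [B] and B to [C, D]. In the method described here, only the elements of one extracted cluster would be passed to the composition step.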

Our method is evaluated on a real WWW access log; the results show that it is practically useful for finding long sequences and for presenting the generalized sequence as a graph.

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Oyanagi, S., Kubota, K., Nakase, A. (2003). Mining WWW Access Sequence by Matrix Clustering. In: Zaïane, O.R., Srivastava, J., Spiliopoulou, M., Masand, B. (eds) WEBKDD 2002 - Mining Web Data for Discovering Usage Patterns and Profiles. WebKDD 2002. Lecture Notes in Computer Science, vol 2703. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39663-5_8

  • DOI: https://doi.org/10.1007/978-3-540-39663-5_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-20304-9

  • Online ISBN: 978-3-540-39663-5

  • eBook Packages: Springer Book Archive
