Mining Compressed Sequential Patterns

  • Conference paper
Advanced Data Mining and Applications (ADMA 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4093)

Abstract

Current sequential pattern mining algorithms often produce a large number of patterns. It is difficult for a user to explore so many patterns and gain a global view of them and of the underlying data. In this paper, we examine the problem of compressing a set of sequential patterns using only K SP-Features (Sequential Pattern Features). We propose a novel similarity measure for clustering SP-Features and design an effective method for combining SP-Features. We also present an efficient algorithm, called CSP (Compressing Sequential Patterns), that mines compressed sequential patterns within a hierarchical clustering framework. A thorough experimental study on both real and synthetic datasets shows that CSP can compress sequential patterns effectively.
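The abstract does not define the similarity measure or the SP-Feature combination method, so the following is only a rough sketch of the general idea of reducing a pattern set to K groups by hierarchical (agglomerative) clustering. Everything in it is an assumption made for illustration: patterns are simplified to non-empty tuples of single items, `similarity` is a stand-in based on the longest common subsequence, clusters are plain lists of patterns rather than SP-Features, and the function names are hypothetical. It is not the CSP algorithm.

```python
from itertools import combinations


def lcs_len(a, b):
    """Length of the longest common subsequence of two item sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def similarity(a, b):
    """Stand-in similarity in [0, 1]; NOT the measure proposed in the paper.

    Assumes both patterns are non-empty.
    """
    return lcs_len(a, b) / max(len(a), len(b))


def cluster_similarity(c1, c2):
    """Average-link similarity between two clusters of patterns."""
    total = sum(similarity(a, b) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))


def compress(patterns, k):
    """Greedily merge the most similar pair of clusters until k remain."""
    clusters = [[p] for p in patterns]
    while len(clusters) > k:
        # combinations() yields index pairs with i < j, so deleting j is safe.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_similarity(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


patterns = [("a", "b", "c"), ("a", "b", "d"), ("x", "y"), ("x", "y", "z")]
groups = compress(patterns, 2)
```

With these four toy patterns and K = 2, the two patterns starting with `a, b` end up in one group and the two starting with `x, y` in the other. A real compression method would additionally replace each group with a compact summary, which this sketch leaves out.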

This work is supported by the National Natural Science Foundation of China under Grant No. 60473051.

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chang, L., Yang, D., Tang, S., Wang, T. (2006). Mining Compressed Sequential Patterns. In: Li, X., Zaïane, O.R., Li, Z. (eds) Advanced Data Mining and Applications. ADMA 2006. Lecture Notes in Computer Science, vol 4093. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11811305_83

  • DOI: https://doi.org/10.1007/11811305_83

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-37025-3

  • Online ISBN: 978-3-540-37026-0

  • eBook Packages: Computer Science, Computer Science (R0)
