Mining Compressed Sequential Patterns

Chang, Lei; Yang, Dongqing; Tang, Shiwei; Wang, Tengjiao

doi:10.1007/11811305_83


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4093)

Included in the following conference series:

International Conference on Advanced Data Mining and Applications

Abstract

Current sequential pattern mining algorithms often produce a large number of patterns. It is difficult for a user to explore so many patterns and get a global view of the patterns and the underlying data. In this paper, we examine the problem of how to compress a set of sequential patterns using only K SP-Features (Sequential Pattern Features). A novel similarity measure is proposed for clustering SP-Features, and an effective SP-Feature combination method is designed. We also present an efficient algorithm, called CSP (Compressing Sequential Patterns), to mine compressed sequential patterns based on the hierarchical clustering framework. A thorough experimental study with both real and synthetic datasets shows that CSP can compress sequential patterns effectively.
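As a rough illustration of the hierarchical clustering framework mentioned in the abstract, the following minimal sketch merges a set of patterns bottom-up until only K groups remain. The abstract does not define the similarity measure or the SP-Feature combination method, so everything here is an assumption: `similarity` is a stand-in based on the longest common subsequence, merged groups are simply kept as lists of their member patterns, and the function names are hypothetical. It is not the CSP algorithm itself.

```python
from itertools import combinations


def lcs_length(a, b):
    """Length of the longest common subsequence of two item sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Stand-in similarity in [0, 1]; NOT the measure proposed in the paper."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


def cluster_similarity(c1, c2):
    """Average-link similarity between two groups of patterns (an assumption)."""
    return sum(similarity(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))


def compress_patterns(patterns, k):
    """Agglomeratively merge the most similar groups until only k remain."""
    clusters = [[p] for p in patterns]
    while len(clusters) > k:
        # Pick the pair of groups with the highest average-link similarity.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_similarity(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


patterns = [("a", "b", "c"), ("a", "b", "d"), ("x", "y"), ("x", "y", "z")]
groups = compress_patterns(patterns, k=2)
```

With the four toy patterns above and `k=2`, the patterns sharing the prefix `a, b` end up in one group and those sharing `x, y` in the other. A real implementation would replace each group with a single combined SP-Feature rather than a list of members.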

This work is supported by the National Natural Science Foundation of China under Grant No. 60473051.

Author information

Authors and Affiliations

Department of Computer Science & Technology, Peking University, Beijing, China
Lei Chang, Dongqing Yang, Shiwei Tang & Tengjiao Wang

Editor information

Editors and Affiliations

School of Information Technology and Electronic Engineering, The University of Queensland, Queensland, Australia
Xue Li
University of Alberta, Canada
Osmar R. Zaïane
Northwest Polytechnical University, China
Zhanhuai Li

About this paper

Cite this paper

Chang, L., Yang, D., Tang, S., Wang, T. (2006). Mining Compressed Sequential Patterns. In: Li, X., Zaïane, O.R., Li, Z. (eds) Advanced Data Mining and Applications. ADMA 2006. Lecture Notes in Computer Science, vol 4093. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11811305_83

DOI: https://doi.org/10.1007/11811305_83
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37025-3
Online ISBN: 978-3-540-37026-0
eBook Packages: Computer Science, Computer Science (R0)
