Constraint-based sequential pattern mining: the pattern-growth methods

Pei, Jian; Han, Jiawei; Wang, Wei

doi:10.1007/s10844-006-0006-z

Published: 20 January 2007

Journal of Intelligent Information Systems, volume 28, pages 133–160 (2007)

Abstract

Constraints are essential for many sequential pattern mining applications. However, there has been no systematic study of constraint-based sequential pattern mining. In this paper, we investigate this issue and point out that the framework developed for constrained frequent-pattern mining does not fit our mission well. An extended framework is developed based on a sequential pattern-growth methodology. Our study shows that constraints can be effectively and efficiently pushed deep into sequential pattern mining under this new framework. Moreover, this framework can be extended to constraint-based structured pattern mining as well.
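To illustrate what "pushing a constraint deep" into pattern growth can look like, here is a minimal Python sketch. It is not the authors' algorithm: it assumes each sequence element is a single item (not an itemset), uses a simple prefix-projection scheme, and handles only a constraint that is anti-monotonic (once a pattern violates it, every extension does too). The names `mine`, `grow`, `constraint`, and the toy database `db` are all hypothetical.

```python
from collections import defaultdict


def mine(db, min_support, constraint=lambda pattern: True):
    """Prefix-projected pattern growth over sequences of single items.

    `constraint` is assumed to be anti-monotonic, so a pattern that
    fails it is dropped together with all of its extensions.
    """
    results = {}

    def grow(prefix, projected):
        # `projected` holds (sequence index, start offset) pairs: the
        # suffixes that remain after matching `prefix`.
        first_pos = defaultdict(dict)
        for sid, start in projected:
            seq = db[sid]
            for pos in range(start, len(seq)):
                # Keep only the first occurrence per sequence, so each
                # sequence contributes at most once to the support.
                first_pos[seq[pos]].setdefault(sid, pos)
        for item in sorted(first_pos):
            occurrences = first_pos[item]
            if len(occurrences) < min_support:
                continue  # infrequent: no extension can be frequent
            pattern = prefix + (item,)
            if not constraint(pattern):
                continue  # constraint pruning inside the growth step
            results[pattern] = len(occurrences)
            grow(pattern, [(sid, pos + 1) for sid, pos in occurrences.items()])

    grow((), [(sid, 0) for sid in range(len(db))])
    return results


# Toy database (illustrative data only).
db = [
    ("a", "b", "c", "b"),
    ("a", "c", "b"),
    ("b", "c", "a"),
    ("a", "b"),
]

all_patterns = mine(db, min_support=2)
short_patterns = mine(db, min_support=2, constraint=lambda p: len(p) <= 2)
```

In this sketch the length constraint is checked before recursing, so the projected databases for over-long patterns are never built. Filtering `all_patterns` after mining would give the same answer but would pay for the full search first.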

Author information

Authors and Affiliations

School of Computing Science, Simon Fraser University, British Columbia, Canada
Jian Pei
University of Illinois at Urbana-Champaign, Urbana, USA
Jiawei Han
Fudan University, Shanghai, China
Wei Wang

Corresponding author

Correspondence to Jian Pei.

Additional information

This research is supported in part by NSERC Grant 312194-05, NSF Grants IIS-0308001, IIS-0513678, BDI-0515813 and National Science Foundation of China (NSFC) grants No. 60303008 and 69933010. All opinions, findings, conclusions and recommendations in this paper are those of the authors and do not necessarily reflect the views of the funding agencies.

About this article

Cite this article

Pei, J., Han, J. & Wang, W. Constraint-based sequential pattern mining: the pattern-growth methods. J Intell Inf Syst 28, 133–160 (2007). https://doi.org/10.1007/s10844-006-0006-z

Received: 31 January 2003
Revised: 23 March 2005
Accepted: 28 June 2005
Published: 20 January 2007
Issue Date: April 2007
DOI: https://doi.org/10.1007/s10844-006-0006-z
