Constraint-based sequential pattern mining: the pattern-growth methods

Journal of Intelligent Information Systems

Abstract

Constraints are essential for many sequential pattern mining applications. However, there has been no systematic study of constraint-based sequential pattern mining. In this paper, we investigate this issue and point out that the framework developed for constrained frequent-pattern mining does not fit the sequential setting well. We develop an extended framework based on a sequential pattern-growth methodology. Our study shows that, under this new framework, constraints can be pushed deep into sequential pattern mining both effectively and efficiently. Moreover, the framework can be extended to constraint-based structured pattern mining as well.
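To make the idea of pushing a constraint into pattern growth concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes each sequence is a plain list of single items (not itemsets), grows patterns by prefix projection, and takes a constraint that is anti-monotone, meaning that once a pattern violates it every extension of that pattern does too. The names `mine`, `grow`, `min_support`, and `constraint` are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def mine(db, min_support, constraint=lambda pattern: True):
    """Pattern-growth mining over sequences of single items (illustrative).

    `constraint` is ASSUMED to be anti-monotone: if a pattern fails it,
    all of its extensions fail too, so the whole branch can be skipped.
    """
    results = {}

    def grow(prefix, projected):
        # projected: list of (sequence index, position where the suffix starts)
        occurrences = defaultdict(dict)
        for sid, start in projected:
            seq = db[sid]
            for pos in range(start, len(seq)):
                # Record only the first occurrence of each item per sequence;
                # the stored value is where the next suffix would begin.
                occurrences[seq[pos]].setdefault(sid, pos + 1)
        for item in sorted(occurrences):
            firsts = occurrences[item]
            if len(firsts) < min_support:
                continue  # infrequent: no extension can be frequent
            pattern = prefix + (item,)
            if not constraint(pattern):
                continue  # constraint checked during growth: prune this branch
            results[pattern] = len(firsts)
            grow(pattern, list(firsts.items()))

    grow((), [(sid, 0) for sid in range(len(db))])
    return results


db = [list("abc"), list("acb"), list("ab")]
unconstrained = mine(db, min_support=2)
short_only = mine(db, min_support=2, constraint=lambda p: len(p) <= 1)
```

Because the constraint is tested before recursing, a failing pattern is never projected further, which is the sense in which the constraint is pushed into the search rather than applied as a filter on the final output. Constraints that are not anti-monotone would need different handling than this sketch provides.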

Author information

Corresponding author

Correspondence to Jian Pei.

Additional information

This research is supported in part by NSERC Grant 312194-05, NSF Grants IIS-0308001, IIS-0513678, BDI-0515813 and National Science Foundation of China (NSFC) grants No. 60303008 and 69933010. All opinions, findings, conclusions and recommendations in this paper are those of the authors and do not necessarily reflect the views of the funding agencies.

About this article

Cite this article

Pei, J., Han, J. & Wang, W. Constraint-based sequential pattern mining: the pattern-growth methods. J Intell Inf Syst 28, 133–160 (2007). https://doi.org/10.1007/s10844-006-0006-z

  • DOI: https://doi.org/10.1007/s10844-006-0006-z
