Mining sequential patterns with itemset constraints

  • Regular Paper
  • Knowledge and Information Systems

Abstract

Sequential pattern mining discovers all the frequent sequences in a sequence database. However, the mining may return a huge number of patterns, while users are interested in only a particular subset of them. In this paper, we consider the problem of mining sequential patterns with itemset constraints. To solve this problem, we propose a new algorithm named MSPIC-DBV, a pattern-growth algorithm that uses prefixes and dynamic bit vectors. The algorithm prunes the search space both at the beginning of and during the mining process, and it reduces the number of candidates that need to be checked. The experimental results show that the proposed algorithm outperforms the previous methods.
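To make the problem statement concrete, the following is a minimal Python sketch of what it might mean for a pattern to be both frequent and constraint-satisfying. It is not MSPIC-DBV and uses neither prefixes nor dynamic bit vectors; it only checks a single candidate pattern by brute force. The constraint semantics (a pattern qualifies if at least one constraint itemset is contained in one of its itemsets), the function names, and the toy database are assumptions made for illustration and are not taken from the paper.

```python
from typing import FrozenSet, List, Sequence

Itemset = FrozenSet[str]
Pattern = Sequence[Itemset]


def contains(sequence: Pattern, pattern: Pattern) -> bool:
    """Return True if `pattern` is a subsequence of `sequence`.

    Each itemset of the pattern must be a subset of a distinct itemset of
    the sequence, in the same order (greedy left-to-right matching).
    """
    pos = 0
    for itemset in pattern:
        while pos < len(sequence) and not itemset <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1  # the next pattern itemset must match a later position
    return True


def support(database: List[Pattern], pattern: Pattern) -> int:
    """Count the sequences in the database that contain the pattern."""
    return sum(contains(seq, pattern) for seq in database)


def satisfies(pattern: Pattern, constraints: List[Itemset]) -> bool:
    """Assumed semantics: some constraint itemset is a subset of some
    itemset of the pattern."""
    return any(c <= itemset for c in constraints for itemset in pattern)


def is_answer(database, pattern, min_support, constraints) -> bool:
    """A pattern is reported only if it is frequent and meets the constraint."""
    return (support(database, pattern) >= min_support
            and satisfies(pattern, constraints))


# Hypothetical toy data, for illustration only.
DB = [
    [frozenset("a"), frozenset("bc"), frozenset("d")],
    [frozenset("ab"), frozenset("c")],
    [frozenset("bc"), frozenset("d")],
]
CONSTRAINTS = [frozenset("bc")]
P1 = [frozenset("bc"), frozenset("d")]  # frequent and satisfies the constraint
P2 = [frozenset("a"), frozenset("c")]   # frequent but violates the constraint
```

With a minimum support of 2, `is_answer(DB, P1, 2, CONSTRAINTS)` holds while `is_answer(DB, P2, 2, CONSTRAINTS)` does not, even though both patterns occur in two sequences. Filtering after the fact like this is the naive baseline; the abstract's point is that pruning with the constraint during mining avoids generating candidates such as `P2` in the first place.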

Acknowledgements

This research is funded by the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 102.05-2015.07.

Author information

Corresponding author

Correspondence to Bac Le.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (rar 573 KB)

About this article

Cite this article

Van, T., Vo, B. & Le, B. Mining sequential patterns with itemset constraints. Knowl Inf Syst 57, 311–330 (2018). https://doi.org/10.1007/s10115-018-1161-6

  • DOI: https://doi.org/10.1007/s10115-018-1161-6
