Abstract
Raw data is rarely used directly. In real-world applications, data is usually processed and the knowledge relevant to the user's purpose is extracted from it. Applying constraints in pattern mining is a major way to reduce the number of resulting patterns, which helps decision support systems work efficiently. In 2018, a constraint-based approach was developed to discover inter-sequence patterns, but it handled only constraints on single items. This work targets the mining of inter-sequence patterns under itemset constraints. We propose DBV-ISPMIC, an algorithm built on the DBV-PatternList structure, for mining inter-sequence patterns with itemset constraints. The algorithm organizes candidates in a search tree whose nodes are stored as dynamic bit vectors, so that the support of a pattern can be computed quickly. We also establish a property and use it to derive an improved algorithm that reduces the number of candidates to be checked. Finally, we develop pDBV-ISPMIC, a parallel version of DBV-ISPMIC. Empirical evaluations show that DBV-ISPMIC outperforms the post-processing algorithms on the experimental databases and that pDBV-ISPMIC has a shorter runtime than DBV-ISPMIC.
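To make the role of dynamic bit vectors more concrete, the following is a minimal Python sketch of the general idea: a bit vector that stores only the bytes between its first and last non-zero byte, plus a start offset, so that intersecting two vectors and counting the set bits gives the support of a combined pattern. It is an illustration under stated assumptions, not the DBV-PatternList structure of the paper. The class name `DBV`, its methods, and the use of plain sequence ids as bit positions are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DBV:
    """Illustrative dynamic bit vector (hypothetical layout, not the paper's)."""
    start: int   # index of the first stored byte within the full bit vector
    data: bytes  # stored bytes; leading and trailing zero bytes are trimmed

    @staticmethod
    def from_ids(ids):
        """Build a DBV from an iterable of non-negative ids (assumed sequence ids)."""
        ids = sorted(set(ids))
        if not ids:
            return DBV(0, b"")
        first, last = ids[0] // 8, ids[-1] // 8
        buf = bytearray(last - first + 1)
        for i in ids:
            buf[i // 8 - first] |= 1 << (i % 8)
        return DBV(first, bytes(buf))

    def intersect(self, other):
        """AND two vectors over their overlapping byte range, then trim zeros."""
        lo = max(self.start, other.start)
        hi = min(self.start + len(self.data), other.start + len(other.data))
        if lo >= hi:
            return DBV(0, b"")
        anded = bytes(
            self.data[k - self.start] & other.data[k - other.start]
            for k in range(lo, hi)
        )
        left, right = 0, len(anded)
        while left < right and anded[left] == 0:
            left += 1
        while right > left and anded[right - 1] == 0:
            right -= 1
        if left == right:
            return DBV(0, b"")
        return DBV(lo + left, anded[left:right])

    def support(self):
        """Number of set bits, i.e. how many ids contain the pattern."""
        return sum(bin(b).count("1") for b in self.data)

    def ids(self):
        """Decode the stored bits back into the list of ids."""
        return [
            (self.start + k) * 8 + bit
            for k, b in enumerate(self.data)
            for bit in range(8)
            if (b >> bit) & 1
        ]


# Two patterns occurring in different sets of sequences (made-up ids).
a = DBV.from_ids([1, 9, 40, 41])
b = DBV.from_ids([9, 41, 100])
joined = a.intersect(b)
print(joined.ids(), joined.support())
```

Only the overlapping byte range is scanned during the intersection, which is where the trimmed representation saves work compared with a fixed-length bit vector.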
About this article
Cite this article
Nguyen, A., Nguyen, NT., Nguyen, L.T. et al. Mining inter-sequence patterns with Itemset constraints. Appl Intell 53, 19827–19842 (2023). https://doi.org/10.1007/s10489-023-04514-7
DOI: https://doi.org/10.1007/s10489-023-04514-7