Abstract
Sequential pattern mining is a challenging problem that has received much attention over the past few decades. Mining large sequential databases can be very time-consuming and produces a large number of unrelated patterns that must be evaluated. In this paper, we explore the problems of frequent prefix, prefix-closed, and prefix-maximal pattern mining, along with their suffix variants. By constraining the pattern mining task, we reduce the required mining time while still obtaining patterns of interest. We introduce notation for prefix/suffix sequential pattern mining and provide the theorems and proofs that underpin our proposed algorithms. We show that the use of projected databases can greatly reduce the time required to mine the complete set of frequent prefix/suffix patterns, prefix/suffix-closed patterns, and prefix/suffix-maximal patterns. Theoretical analysis shows that our approach is better than the existing approach, and empirical analysis on various datasets supports this conclusion.
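To make the role of projected databases concrete, the following is a minimal Python sketch and not the algorithm from the paper. The abstract does not give formal definitions, so the sketch assumes the following: a sequence supports a prefix pattern when it begins with exactly those items; a frequent prefix is closed when no one-item extension has the same support; and it is maximal when no one-item extension is frequent. The function names `mine_frequent_prefixes` and `closed_and_maximal` and the toy data are hypothetical.

```python
from collections import defaultdict


def mine_frequent_prefixes(sequences, min_support):
    """Return {prefix_tuple: support} for prefixes shared by >= min_support sequences.

    Assumed definition (not taken from the paper): a sequence supports a
    prefix pattern if the sequence starts with exactly those items.
    """
    frequent = {}

    def grow(prefix, projected):
        # `projected` lists the indices of sequences that start with `prefix`.
        # Only these sequences are scanned when extending the prefix by one item.
        depth = len(prefix)
        buckets = defaultdict(list)
        for idx in projected:
            seq = sequences[idx]
            if depth < len(seq):
                buckets[seq[depth]].append(idx)
        for item, members in buckets.items():
            if len(members) >= min_support:
                extended = prefix + (item,)
                frequent[extended] = len(members)
                grow(extended, members)  # recurse on the smaller projection

    grow((), list(range(len(sequences))))
    return frequent


def closed_and_maximal(frequent):
    """Split frequent prefixes into closed and maximal ones (assumed definitions)."""
    closed, maximal = {}, {}
    for prefix, support in frequent.items():
        # Supports of the frequent one-item extensions of this prefix.
        children = [
            s for p, s in frequent.items()
            if len(p) == len(prefix) + 1 and p[:len(prefix)] == prefix
        ]
        if not children:
            maximal[prefix] = support
        if all(s < support for s in children):
            closed[prefix] = support
    return closed, maximal


# Toy database, invented for illustration only.
sequences = [
    ["a", "b", "c"],
    ["a", "b", "d"],
    ["a", "b", "c", "e"],
    ["b", "c"],
]
frequent = mine_frequent_prefixes(sequences, min_support=2)
closed, maximal = closed_and_maximal(frequent)
print(frequent)
print(closed)
print(maximal)
```

Each recursive call scans only the sequences that already match the current prefix, which is the intuition behind projection: the search space shrinks as the prefix grows. Under the same assumed definitions, the suffix variants could be obtained by reversing every sequence before mining and reversing the resulting patterns afterwards.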
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Singh, R., Graves, J.A., Talbert, D.A., Eberle, W. (2018). Prefix and Suffix Sequential Pattern Mining. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2018. Lecture Notes in Computer Science, vol 10934. Springer, Cham. https://doi.org/10.1007/978-3-319-96136-1_24
DOI: https://doi.org/10.1007/978-3-319-96136-1_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-96135-4
Online ISBN: 978-3-319-96136-1
eBook Packages: Computer Science (R0)