New approaches for mining regular high utility sequential patterns

Ishita, Sabrina Zaman; Ahmed, Chowdhury Farhan; Leung, Carson K.

doi:10.1007/s10489-021-02536-7

Published: 10 July 2021

Volume 52, pages 3781–3806, (2022)

Applied Intelligence

592 Accesses
31 Citations

Abstract

Regular pattern mining has emerged as a promising sub-domain of data mining that discovers patterns occurring at regular intervals throughout a complete database. Utility-based pattern mining, in contrast, considers non-binary frequencies of items along with their importance values, and hence reveals more significant patterns than traditional frequent pattern mining. Although regular patterns carry interesting knowledge, considering the utility values of the patterns unveils more interesting and practical information. In sequence databases, the task of mining regular high utility patterns is both more useful and more challenging. In the current era of big data, handling the incremental nature of databases, so that mining need not restart from scratch when new updates appear, benefits many applications. Moreover, databases can be dynamically updated in the form of data streams, where new batches of data are added to the database at a high rate, and a window consisting of several recent batches can be of great interest to some end-users. To address these problems, we first introduce the concept of regular high utility sequential patterns and develop an algorithm for mining these patterns from static databases. We then extend our algorithm to mine regular high utility sequential patterns from incremental databases and sliding-window-based data streams. These two approaches produce approximate results in order to generate the intended patterns faster and thus boost performance. Extensive performance analyses of all the algorithms over several real-life datasets show impressive results compared to existing research.
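The two criteria combined in the abstract can be illustrated with a small sketch. The abstract does not state the formal definitions, so everything below is an assumption made for illustration: regularity is taken as the largest gap between consecutive sequences that contain the pattern (in the spirit of regular pattern mining in transactional databases), utility is taken as quantity times unit profit, maximised over the occurrences of the pattern within a sequence, and the names `pattern_utility`, `regularity`, `summarize`, `unit_profit`, and the toy database are all hypothetical. This is a brute-force check of a single candidate pattern, not the mining algorithm proposed in the article.

```python
from typing import Dict, List, Optional, Tuple

# Assumed representation: a sequence is an ordered list of itemsets,
# and each itemset maps an item to its purchased quantity.
Sequence = List[Dict[str, int]]

NEG = float("-inf")  # marks "pattern prefix not matched yet"


def pattern_utility(seq: Sequence, pattern: List[str],
                    unit_profit: Dict[str, int]) -> Optional[int]:
    """Best utility of `pattern` (one item per step) over all of its
    occurrences in `seq`; None if the pattern does not occur."""
    m = len(pattern)
    best = [0] + [NEG] * m  # best[j]: best utility matching the first j items
    for itemset in seq:
        # Go backwards so one itemset serves at most one pattern step.
        for j in range(m, 0, -1):
            item = pattern[j - 1]
            if item in itemset and best[j - 1] != NEG:
                gain = itemset[item] * unit_profit[item]
                best[j] = max(best[j], best[j - 1] + gain)
    return None if best[m] == NEG else best[m]


def regularity(occurrence_ids: List[int], db_size: int) -> int:
    """Largest gap between consecutive occurrences, counting the gap
    before the first and after the last one (sequence ids are 1-based)."""
    points = [0] + sorted(occurrence_ids) + [db_size]
    return max(b - a for a, b in zip(points, points[1:]))


def summarize(db: List[Sequence], pattern: List[str],
              unit_profit: Dict[str, int]) -> Tuple[int, int]:
    """Return (regularity, total utility) of `pattern` in `db`."""
    ids, total = [], 0
    for sid, seq in enumerate(db, start=1):
        u = pattern_utility(seq, pattern, unit_profit)
        if u is not None:
            ids.append(sid)
            total += u
    return regularity(ids, len(db)), total


# Hypothetical toy data, used only to exercise the functions.
unit_profit = {"a": 2, "b": 5, "c": 1}
db = [
    [{"a": 1}, {"b": 2}],
    [{"a": 2, "c": 1}, {"c": 3}],
    [{"a": 1}, {"b": 1}, {"b": 3}],
    [{"b": 1}, {"a": 3}, {"b": 1}],
]
reg, util = summarize(db, ["a", "b"], unit_profit)
```

On this toy database the pattern "a followed by b" occurs in sequences 1, 3, and 4, so `summarize` returns a regularity of 2 and a total utility of 40. Under the assumed definitions, the pattern would qualify as regular and high utility for any maximum regularity threshold of at least 2 and any minimum utility threshold of at most 40.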

References

1. Agrawal R, Mannila H, Srikant R, Toivonen H, Verkamo AI, et al. (1996) Fast discovery of association rules. Adv Knowl Discov Data Min 12(1):307–328
2. Han J, Pei J, Yin Y (2000) Mining frequent patterns without candidate generation. In: ACM SIGMOD Record, vol 29. ACM, pp 1–12
3. Ayres J, Flannick J, Gehrke J, Yiu T (2002) Sequential PAttern Mining Using a Bitmap Representation. In: Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’02. ACM, New York, pp 429–435
4. Han J, Pei J, Mortazavi-Asl B, Chen Q, Dayal U, Hsu M-C (2000) FreeSpan: frequent pattern-projected sequential pattern mining. In: Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp 355–359
5. Pei J, Han J, Mortazavi-Asl B, Wang J, Pinto H, Chen Q, Dayal U, Hsu M-C (2004) Mining sequential patterns by pattern-growth: the PrefixSpan approach. IEEE Trans Knowl Data Eng 16(11):1424–1440
6. Srikant R, Agrawal R (1996) Mining Sequential Patterns: Generalizations and Performance Improvements. In: Proceedings of the 5th International Conference on Extending Database Technology: Advances in Database Technology, EDBT ’96. Springer, London, pp 3–17
7. Zaki MJ (2001) SPADE: An Efficient Algorithm for Mining Frequent Sequences. Mach Learn 42:31–60
8. Pei J, Han J, Wang W (2002) Mining sequential patterns with constraints in large databases. In: Proceedings of the eleventh international conference on Information and knowledge management, pp 18–25
9. Pei J, Han J, Wang W (2007) Constraint-based sequential pattern mining: the pattern-growth methods. J Intell Inf Syst 28(2):133–160
10. Ahmed CF, Tanbeer SK, Jeong B, Lee Y (2009) Efficient Tree Structures for High Utility Pattern Mining in Incremental Databases. IEEE Trans Knowl Data Eng 21(12):1708–1721
11. Ahmed CF, Tanbeer SK, Jeong BS, Lee YK (2011) HUC-Prune: an efficient candidate pruning technique to mine high utility patterns. Appl Intell 34(2):181–198
12. Tseng VS, Wu C-W, Shie B-E, Yu PS (2010) UP-Growth: An Efficient Algorithm for High Utility Itemset Mining. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’10. ACM, New York, pp 253–262
13. Yao H, Hamilton HJ, Butz CJ (2004) A Foundational Approach to Mining Itemset Utilities from Databases. In: Proceedings of the Fourth SIAM International Conference on Data Mining, SDM’04, pp 482–486
14. Yeh J-S, Li Y-C, Chang C-C (2007) Two-phase Algorithms for a Novel Utility-frequent Mining Model. In: Proceedings of the 2007 International Conference on Emerging Technologies in Knowledge Discovery and Data Mining, PAKDD’07. Springer, Berlin, pp 433–444
15. Tanbeer SK, Ahmed CF, Jeong BS, Lee YK (2008) Mining Regular Patterns in Transactional Databases. IEICE Trans Inf Syst E91.D(11):2568–2577
16. Leung C K-S, Khan QI, Li Z, Hoque T (2007) CanTree: a canonical-order tree for incremental frequent-pattern mining. Knowl Inf Syst 11(3):287–311
17. Tanbeer SK, Ahmed CF, Jeong BS, Lee YK (2009) Efficient single-pass frequent pattern mining using a prefix-tree. Inf Sci 179(5):559–583. Special Section - Quantum Structures: Theory and Applications
18. Tanbeer SK, Ahmed CF, Jeong BS, Lee YK (2009) Sliding window-based frequent pattern mining over data streams. Inf Sci 179(22):3843–3865
19. Ahmed CF, Tanbeer SK, Jeong BS (2010) A Novel Approach for Mining High-Utility Sequential Patterns in Sequence Databases. ETRI J 32(5):676–686
20. Yin J, Zheng Z, Cao L (2012) USpan: An Efficient Algorithm for Mining High Utility Sequential Patterns. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’12. ACM, New York, pp 660–668
21. Alkan OK, Karagoz P (2016) CRoM and HuspExt: Improving efficiency of high utility sequential pattern extraction. In: 2016 IEEE 32nd International Conference on Data Engineering (ICDE), pp 1472–1473
22. Duong Q-H, Fournier-Viger P, Ramampiaro H, Nørvåg K, Dam T-L (2018) Efficient high utility itemset mining using buffered utility-lists. Appl Intell 48(7):1859–1877
23. Fournier-Viger P, Zhang Y, Chun-Wei Lin J, Fujita H, Koh YS (2019) Mining local and peak high utility itemsets. Inf Sci 481:344–367
24. Nguyen LTT, Vu VV, Lam MTH, Duong TTM, Manh LT, Nguyen TTT, Vo B, Fujita H (2019) An efficient method for mining high utility closed itemsets. Inf Sci 495:78–99
25. Singh K, Singh SS, Kumar A, Biswas B (2019) Tkeh: an efficient algorithm for mining top-k high utility itemsets. Appl Intell 49(3):1078–1097
26. Lin J C-W, Pirouz M, Djenouri Y, Cheng C-F, Ahmed U (2020) Incrementally updating the high average-utility patterns with pre-large concept. Appl Intell:1–20
27. Truong T, Duong H, Le B, Fournier-Viger P, Yun U, Fujita H (2021) Efficient algorithms for mining frequent high utility sequences with constraints. Inf Sci
28. Dinh D-T, Le B, Fournier-Viger P, Huynh V-N (2018) An efficient algorithm for mining periodic high-utility sequential patterns. Appl Intell 48(12):4694–4714
29. Tanbeer SK, Ahmed CF, Jeong BS, Lee YK (2009) Discovering Periodic-Frequent Patterns in Transactional Databases. In: Theeramunkong T, Kijsirikul B, Cercone N, Ho T-B (eds) Advances in Knowledge Discovery and Data Mining. Springer, Berlin, pp 242–253
30. Lee J, Yun U, Lee G, Yoon E (2018) Efficient incremental high utility pattern mining based on pre-large concept. Eng Appl Artif Intell 72:111–123
31. Cheng H, Yan X, Han J (2004) IncSpan: incremental mining of sequential patterns in large database. In: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, pp 527–532
32. Lin C-W, Hong T-P, Lu W-H, Lin W-Y (2008) An Incremental FUSP-tree Maintenance Algorithm. In: Eighth International Conference on Intelligent Systems Design and Applications. IEEE, pp 445–449
33. Nguyen SN, Sun X, Orlowska ME (2005) Improvements of IncSpan: Incremental mining of sequential patterns in large database. In: Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, pp 442–451
34. Lin J C-W, Hong T-P, Gan W, Chen H-Y, Li S-T (2015) Incrementally updating the discovered sequential patterns based on pre-large concept. Intell Data Anal 19(5):1071–1089
35. Leung C K-S, Khan QI (2006) DSTree: a tree structure for the mining of frequent sets from data streams. In: Sixth International Conference on Data Mining (ICDM’06). IEEE, pp 928–932
36. Chen G, Wu X, Zhu X (2005) Mining sequential patterns across data streams. Ph.D. Thesis, University of Vermont
37. Ho C-C, Li H-F, Kuo F-F, Lee S-Y (2006) Incremental mining of sequential patterns over a stream sliding window. In: Sixth IEEE International Conference on Data Mining Workshops (ICDM Workshops 2006). IEEE, pp 677–681
38. Marascu A, Masseglia F (2005) Mining sequential patterns from temporal streaming data. In: Proceedings of the 1st ECML/PKDD Workshop on Mining Spatio-Temporal Data (MSTD 2005), pp 1–13
39. Raissi C, Poncelet P, Teisseire M (2006) SPEED: mining maximal sequential patterns over data streams. In: IS: Intelligent Systems, pp 546–552
40. Chang L, Wang T, Yang D, Luan H (2008) Seqstream: Mining closed sequential patterns over stream sliding windows. In: 2008 Eighth IEEE International Conference on Data Mining, pp 83–92
41. Tseng VS, Chu C-J, Liang T (2006) Efficient mining of temporal high utility itemsets from data streams. In: Proceedings of Second International Workshop on Utility-Based Data Mining. Citeseer
42. Ahmed CF, Tanbeer SK, Jeong BS, Choi HJ (2012) Interactive mining of high utility patterns over data streams. Expert Syst Appl 39(15):11979–11991
43. Ryang H, Yun U (2016) High utility pattern mining over data streams with sliding window technique. Expert Syst Appl 57:214–231
44. Zihayat M, Wu C-W, An A, Tseng VS, Lin C (2017) Efficiently mining high utility sequential patterns in static and streaming data. Intell Data Anal 21(S1):S103–S135
45. Zihayat M, Chen Y, An A (2017) Memory-adaptive high utility sequential pattern mining over data streams. Mach Learn 106(6):799–836
46. Ishita SZ, Ahmed CF, Leung CK, Hoi CHS (2019) Mining regular high utility sequential patterns in static and dynamic databases. In: Proceedings of the 13th International Conference on Ubiquitous Information Management and Communication, IMCOM 2019, Phuket, Thailand, January 4-6, 2019, pp 897–916
47. Fournier-Viger P, Lin J C-W, Gomariz A, Gueniche T, Soltani A, Deng Z, Lam HT (2016) The SPMF open-source data mining library version 2. In: 19th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD 2016) Part III. Springer, pp 36–40

Acknowledgements

We would like to express our deep gratitude to the anonymous reviewers of this article. Their useful comments have played a significant role in improving the quality of this work. This work was supported by the High-Profile ICT Scholar Fellowship (2017-2018) funded by the Information and Communication Technology (ICT) Division, Ministry of Posts, Telecommunications and Information Technology, Government of the People’s Republic of Bangladesh; the Natural Sciences and Engineering Research Council of Canada (NSERC); and the University of Manitoba. A portion of this work has been published earlier [46].

Author information

Authors and Affiliations

Department of Computer Science and Engineering, University of Dhaka, Dhaka, Bangladesh
Sabrina Zaman Ishita & Chowdhury Farhan Ahmed
Department of Computer Science, University of Manitoba, Winnipeg, MB, Canada
Carson K. Leung

Corresponding author

Correspondence to Chowdhury Farhan Ahmed.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ishita, S.Z., Ahmed, C.F. & Leung, C.K. New approaches for mining regular high utility sequential patterns. Appl Intell 52, 3781–3806 (2022). https://doi.org/10.1007/s10489-021-02536-7

Accepted: 17 May 2021
Published: 10 July 2021
Issue Date: March 2022
DOI: https://doi.org/10.1007/s10489-021-02536-7
