High-utility sequential pattern mining in incremental database

The Journal of Supercomputing

Abstract

Previous algorithms for mining high-utility sequential patterns have focused primarily on static databases. In a dynamic database, where new data are constantly added, these algorithms must rescan the entire database to update their results. This maintenance process consumes significant time and resources and delays responses. To address this issue, this paper proposes an incremental mining algorithm called Pre-HUSPM, which uses the pre-large concept to maintain the discovered high-utility sequential patterns as new sequences are inserted into the database. A novel threshold, denoted \(SWU_{max}\), is also introduced to reduce the number of database rescans and speed up the algorithm. Experimental results show that the algorithm greatly reduces computation time and resource consumption, so it can respond faster to data changes and produce updated mining results. The algorithm can help manufacturers design products that match customer preferences based on previous products, which improves operational efficiency, guides customers toward sound purchasing decisions, and ultimately raises company profits.
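
The pre-large idea mentioned above can be illustrated with a minimal sketch. It assumes the classical count-based formulation of pre-large patterns, in which a lower and an upper threshold split patterns into small, pre-large, and large bands, and a safety bound of floor((upper - lower) * d / (1 - upper)) insertions can be absorbed before the original database of d records must be rescanned. The abstract does not define the utility-based thresholds used by Pre-HUSPM or the \(SWU_{max}\) threshold, so neither is modeled here; the names `classify`, `safety_bound`, and `RescanTracker` are hypothetical.

```python
from fractions import Fraction


def classify(ratio: Fraction, lower: Fraction, upper: Fraction) -> str:
    """Place a pattern in one of the three pre-large bands by its ratio."""
    if ratio >= upper:
        return "large"
    if ratio >= lower:
        return "pre-large"
    return "small"


def safety_bound(d: int, lower: Fraction, upper: Fraction) -> int:
    """Classical count-based bound: floor((upper - lower) * d / (1 - upper)).

    Assumption: this is the frequency-based bound, not the utility-based
    one the paper uses. Exact fractions avoid floating-point rounding.
    """
    return (upper - lower) * d // (1 - upper)


class RescanTracker:
    """Hypothetical helper that counts insertions since the last rescan."""

    def __init__(self, d: int, lower: Fraction, upper: Fraction) -> None:
        self.bound = safety_bound(d, lower, upper)
        self.inserted = 0

    def insert(self, n_new: int) -> bool:
        """Return True once accumulated insertions exceed the safety bound."""
        self.inserted += n_new
        return self.inserted > self.bound
```

With 100 original records, a lower threshold of 30%, and an upper threshold of 50%, the bound is 40: a small pattern has fewer than 30 occurrences, so after 40 insertions it has fewer than 70, which is still below 50% of 140. Only when more than 40 records have accumulated does a rescan become necessary.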

Data availability

No datasets were generated or analyzed during the current study.

Author information

Contributions

H. helped in conceptualization, methodology, software, investigation, formal analysis, visualization, writing—original draft; F. was involved in data curation, visualization, writing—original draft; M. performed investigation, supervision; J., corresponding author, helped in conceptualization, resources, project administration, supervision, writing—review and editing.

Corresponding author

Correspondence to Jimmy Ming-Tai Wu.

Ethics declarations

Conflict of interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yan, H., Li, F., Hsieh, MC. et al. High-utility sequential pattern mining in incremental database. J Supercomput 81, 81 (2025). https://doi.org/10.1007/s11227-024-06568-x

  • DOI: https://doi.org/10.1007/s11227-024-06568-x
