Mining of High-Utility Patterns in Big IoT-based Databases

Mobile Networks and Applications

Abstract

Within data mining, high-utility itemset mining (HUIM) can be regarded as an extension of frequent itemset mining (FIM). HUIM takes more factors into account than FIM, which gives it an intrinsic edge. With the rapid development of IoT techniques, mining patterns from uncertain data has also attracted attention. Potential high-utility itemset mining (PHUIM) was introduced to reveal valuable patterns in an uncertain database. Although previous methods mine potential high-utility itemsets quickly and effectively, they are not specifically designed for databases with an enormous number of records. In these methods, the uncertain transaction dataset is ultimately loaded into memory. Usually, several pre-defined operators are then applied to modify the original dataset in order to reduce the time spent scanning the data. However, this approach is impractical for a big-data dataset. In this work, the dataset is assumed to be too big to be loaded directly into memory, duplicated, or modified, and a MapReduce framework is proposed to handle such situations. One of our main objectives is to reduce the number of dataset scans while maximizing the parallelization of all processes. In-depth experiments show that the proposed algorithm mines all of the potential high-utility itemsets in a big-data dataset and performs well on a Hadoop computing cluster.
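To make the setting concrete, the following is a minimal single-machine sketch of a map step and a reduce step for PHUIM. It is not the algorithm proposed in the article, whose details are not given in this abstract. It assumes the common PHUIM formulation in which each transaction carries an existence probability, an itemset's probability is the sum of the probabilities of the transactions containing it, and an itemset qualifies when both its total utility and its total probability reach user-given thresholds. The data, the profit table, the absolute thresholds, and all function names are hypothetical, and candidates are enumerated by brute force with no pruning.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: each record is (existence probability, {item: quantity}).
TRANSACTIONS = [
    (0.9, {"a": 2, "b": 1}),
    (0.6, {"a": 1, "c": 3}),
    (0.8, {"a": 1, "b": 2, "c": 1}),
    (0.5, {"b": 4}),
]
UNIT_PROFIT = {"a": 5, "b": 2, "c": 1}  # assumed per-unit profit table


def map_partition(partition, max_size=2):
    """Map step: emit (itemset, (utility, probability)) for one data split."""
    for prob, items in partition:
        for size in range(1, max_size + 1):
            for itemset in combinations(sorted(items), size):
                utility = sum(items[i] * UNIT_PROFIT[i] for i in itemset)
                yield itemset, (utility, prob)


def reduce_pairs(pairs):
    """Reduce step: sum utility and probability per itemset."""
    totals = defaultdict(lambda: [0, 0.0])
    for itemset, (utility, prob) in pairs:
        totals[itemset][0] += utility
        totals[itemset][1] += prob
    return totals


def mine(transactions, min_util, min_prob, n_splits=2):
    """Read each split once, aggregate, and keep itemsets passing both thresholds."""
    # Splitting stands in for the partitioning a cluster would perform.
    splits = [transactions[i::n_splits] for i in range(n_splits)]
    pairs = (pair for split in splits for pair in map_partition(split))
    totals = reduce_pairs(pairs)
    return {
        itemset: (utility, prob)
        for itemset, (utility, prob) in totals.items()
        if utility >= min_util and prob >= min_prob
    }


RESULT = mine(TRANSACTIONS, min_util=20, min_prob=1.5)
print(sorted(RESULT))  # itemsets passing both thresholds
```

Each split is read exactly once and never modified, which mirrors the constraint stated in the abstract. A practical method would additionally need pruning to avoid enumerating every candidate itemset.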

Author information

Corresponding author

Correspondence to Jerry Chun-Wei Lin.

About this article

Cite this article

Wu, J.MT., Srivastava, G., Lin, J.CW. et al. Mining of High-Utility Patterns in Big IoT-based Databases. Mobile Netw Appl 26, 216–233 (2021). https://doi.org/10.1007/s11036-020-01701-5

  • DOI: https://doi.org/10.1007/s11036-020-01701-5
