A parallel approach for high utility-based frequent pattern mining in a big data environment

  • Short Communication
Iran Journal of Computer Science

Abstract

Nowadays, social media, smartphones, mobile apps, and the Internet of Things generate huge amounts of data every second. This data may be structured, unstructured, or semi-structured, and it is available in various formats. Traditional approaches are therefore not sufficient to handle such data effectively. At the same time, high utility-based frequent patterns are essential for effective decision making, and demand for them has increased in recent years. Many strategies have been proposed for high utility-based pattern generation, but these methods are limited in the data size they can handle and operate on standalone systems. To address this issue, we propose a parallel procedure for high utility-based frequent pattern mining. The proposed approach can handle huge amounts of data, i.e., big data, and it can be used in various real-life applications that benefit citizens. The proposed technique is implemented on a cluster-node architecture using Spark. The results, validated on various real-time data sets, show that the proposed strategy is more efficient than traditional methods in terms of execution time. The results of the proposed work can help provide solutions to real-time issues in areas such as health care, education, and transport.
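
The preview does not show the authors' algorithm, so the following is only a minimal sketch of the general idea, not their method. It assumes the common definition in which the utility of an itemset in a transaction is the sum of quantity times unit profit over its items, and it keeps itemsets that pass both a support and a utility threshold. The data, the names `TRANSACTIONS`, `UNIT_PROFIT`, `mine_partition`, and `mine`, and the brute-force enumeration are all hypothetical. Plain Python stands in for the per-partition map step and the merge step that a cluster framework such as Spark would distribute.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
TRANSACTIONS = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 3},
    {"b": 4},
]
# Hypothetical unit profit of each item (assumed, not from the paper).
UNIT_PROFIT = {"a": 5, "b": 2, "c": 1}


def mine_partition(partition, max_size=2):
    """Map step: local support and utility of every itemset in one partition."""
    support, utility = Counter(), Counter()
    for transaction in partition:
        items = sorted(transaction)
        for size in range(1, max_size + 1):
            # Brute-force enumeration; real miners prune the search space.
            for itemset in combinations(items, size):
                support[itemset] += 1
                utility[itemset] += sum(
                    transaction[i] * UNIT_PROFIT[i] for i in itemset
                )
    return support, utility


def mine(partitions, min_support, min_utility):
    """Reduce step: merge local counts, then apply both thresholds."""
    support, utility = Counter(), Counter()
    for local_support, local_utility in map(mine_partition, partitions):
        support.update(local_support)   # Counter.update adds counts
        utility.update(local_utility)
    return {
        itemset: (support[itemset], utility[itemset])
        for itemset in support
        if support[itemset] >= min_support and utility[itemset] >= min_utility
    }


# Split the data into two partitions, as a cluster would across worker nodes.
partitions = [TRANSACTIONS[:2], TRANSACTIONS[2:]]
result = mine(partitions, min_support=2, min_utility=10)
```

Because support and utility are both plain sums over transactions, merging the per-partition counters gives the same totals as a single pass over the whole data set, which is what makes this step straightforward to parallelize. The exhaustive enumeration is exponential in transaction length and is used here only to keep the sketch short.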

Notes

  1. http://www.philippe-fournier-viger.com/spmf/.

Author information

Correspondence to Krishna Kumar Mohbey.

About this article

Mohbey, K.K., Kumar, S. A parallel approach for high utility-based frequent pattern mining in a big data environment. Iran J Comput Sci 4, 195–200 (2021). https://doi.org/10.1007/s42044-021-00083-5

  • DOI: https://doi.org/10.1007/s42044-021-00083-5
