Abstract
Social media, smartphones, mobile apps, and the Internet of Things now generate huge amounts of data every second. This data may be structured, semi-structured, or unstructured, and it arrives in a variety of formats, so traditional approaches cannot handle it effectively. At the same time, high utility-based frequent patterns are essential for effective decision-making, and demand for them has grown in recent years. Many strategies have been proposed for generating high utility-based patterns, but these methods are limited in the data sizes they can handle and run on standalone systems. To address this issue, we propose a parallel procedure for high utility-based frequent pattern mining. The proposed approach can handle huge amounts of data, i.e., big data, and can be used in various real-life applications that benefit citizens. The technique is implemented on a cluster-node architecture using Spark. The results are validated on various real-time data sets and show that the proposed strategy is more efficient than traditional methods in terms of execution time. The results of this work can provide solutions to real-time issues in areas such as health care, education, and transport.
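To make the idea of utility-based mining over partitioned data concrete, the following is a minimal sketch in plain Python. It is not the algorithm proposed in the article, whose details the abstract does not give. The toy transactions, the profit table, and the function names `local_utilities` and `mine_high_utility` are all assumptions made for illustration, and an ordinary `map` stands in for the parallel workers that a Spark cluster would provide.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
TRANSACTIONS = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
    {"a": 1, "b": 1, "c": 1},
]
# Hypothetical unit profit per item (the "external utility").
PROFIT = {"a": 5, "b": 2, "c": 1}


def local_utilities(partition, max_size=2):
    """Map step: utility of every itemset (up to max_size) within one partition."""
    totals = Counter()
    for tx in partition:
        items = sorted(tx)
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                # Utility of an itemset in a transaction = sum of quantity * profit.
                totals[itemset] += sum(tx[i] * PROFIT[i] for i in itemset)
    return totals


def mine_high_utility(transactions, min_utility, n_partitions=2):
    """Split the data, compute partial utilities per partition, then merge."""
    partitions = [transactions[i::n_partitions] for i in range(n_partitions)]
    merged = Counter()
    for partial in map(local_utilities, partitions):  # stand-in for parallel workers
        merged.update(partial)  # reduce step: sum the partial utilities
    return {s: u for s, u in merged.items() if u >= min_utility}


result = mine_high_utility(TRANSACTIONS, min_utility=15)
print(result)
```

With this toy data, the itemset `("a", "b")` meets the threshold even though the single item `("b",)` does not. Utility is not anti-monotone in the way support is, so the exhaustive enumeration used in this sketch does not scale, and practical algorithms rely on upper bounds to prune the search space.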
Cite this article
Mohbey, K.K., Kumar, S. A parallel approach for high utility-based frequent pattern mining in a big data environment. Iran J Comput Sci 4, 195–200 (2021). https://doi.org/10.1007/s42044-021-00083-5
DOI: https://doi.org/10.1007/s42044-021-00083-5