Abstract
Association rule mining is a popular data mining task that finds relationships between items in itemsets that frequently co-occur in a transactional database. It has many applications, but the “support-confidence” framework it depends on is inadequate in many cases. In recent years, a generalised task called high utility itemset mining (HUIM) has gained much popularity; it aims to discover itemsets that yield high revenue as measured by a utility function. However, on large data volumes, the running time of state-of-the-art HUIM algorithms often grows exponentially. In this work, we investigate parallel HUIM (PHUIM) algorithms and adapt two state-of-the-art sequential HUIM algorithms for parallel processing on the Apache Spark in-memory data processing platform. Extensive experiments on several benchmark and synthetic datasets show that the proposed methods considerably improve the efficiency of the baseline HUIM algorithms.
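To make the HUIM task concrete, the following is a minimal sketch and not the paper's algorithm. It assumes the common formulation from the HUIM literature, in which the utility of an itemset is the sum of quantity times unit profit over every transaction that contains all of its items. The toy database, the profit table, and the names `utility`, `high_utility_itemsets`, and `min_util` are all hypothetical. The brute-force enumeration visits every non-empty itemset, which illustrates why running time can grow exponentially with the number of items.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 3},
]
# Hypothetical unit profit per item (assumed utility function input).
unit_profit = {"a": 5, "b": 2, "c": 1}


def utility(itemset, db, profit):
    """Sum quantity * unit profit over transactions containing every item."""
    total = 0
    for t in db:
        if all(item in t for item in itemset):
            total += sum(t[item] * profit[item] for item in itemset)
    return total


def high_utility_itemsets(db, profit, min_util):
    """Brute force: test all 2^n - 1 non-empty itemsets against min_util."""
    items = sorted({item for t in db for item in t})
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            u = utility(itemset, db, profit)
            if u >= min_util:
                result[itemset] = u
    return result


huis = high_utility_itemsets(transactions, unit_profit, min_util=15)
print(huis)
```

In this toy data the itemset `("b",)` falls below the threshold while its superset `("a", "b")` reaches it, so utility, unlike support, cannot be pruned with a simple downward-closure rule. This is one reason HUIM needs dedicated algorithms and why distributing the search is attractive.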
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Fan, G., Xiao, H., Zhang, C., Almpanidis, G., Fournier-Viger, P., Fujita, H. (2022). Parallel High Utility Itemset Mining. In: Fujita, H., Fournier-Viger, P., Ali, M., Wang, Y. (eds) Advances and Trends in Artificial Intelligence. Theory and Practices in Artificial Intelligence. IEA/AIE 2022. Lecture Notes in Computer Science(), vol 13343. Springer, Cham. https://doi.org/10.1007/978-3-031-08530-7_69
DOI: https://doi.org/10.1007/978-3-031-08530-7_69
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-08529-1
Online ISBN: 978-3-031-08530-7
eBook Packages: Computer Science (R0)