Parallel High Utility Itemset Mining

Fan, Gaojuan; Xiao, Huaiyuan; Zhang, Chongsheng; Almpanidis, George; Fournier-Viger, Philippe; Fujita, Hamido

doi:10.1007/978-3-031-08530-7_69


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13343)

Included in the following conference series:

International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems

Abstract

Association rule mining is a popular data mining task that finds relationships between values, based on the itemsets that co-occur frequently in a transactional database. It has many applications, but the “support-confidence” framework it depends on is inadequate in many cases. In recent years, a generalised task called high utility itemset mining (HUIM) has gained much popularity; it aims to discover itemsets that yield a high revenue, as measured by a utility function. However, on large data volumes, the running time of state-of-the-art HUIM algorithms often grows exponentially. In this work, we investigate parallel HUIM (PHUIM) algorithms and adapt two state-of-the-art sequential HUIM algorithms for parallel processing on the Apache Spark in-memory data processing platform. Extensive experiments on several benchmark and synthetic datasets show that the proposed methods considerably improve the efficiency of the baseline HUIM algorithms.
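
To make the notion of a utility function concrete, the following minimal Python sketch computes itemset utilities on a toy database and lists the itemsets that reach a minimum utility threshold. Everything in it is an illustrative assumption rather than material from the paper: the item names, unit profits, quantities, the helper names `utility` and `high_utility_itemsets`, and the brute-force enumeration, which stands in for the far more efficient sequential and Spark-based algorithms the paper studies.

```python
from itertools import combinations

# Hypothetical unit profit of each item (illustrative values only).
PROFIT = {"a": 5, "b": 2, "c": 1}

# Hypothetical transactions: item -> purchased quantity.
TRANSACTIONS = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1},
    {"a": 1, "b": 1, "c": 2},
]


def utility(itemset, transactions=TRANSACTIONS, profit=PROFIT):
    """Sum quantity * unit profit over every transaction that contains the whole itemset."""
    total = 0
    for transaction in transactions:
        if all(item in transaction for item in itemset):
            total += sum(transaction[item] * profit[item] for item in itemset)
    return total


def high_utility_itemsets(min_util, transactions=TRANSACTIONS, profit=PROFIT):
    """Brute-force enumeration: keep every itemset whose utility is at least min_util."""
    items = sorted(profit)
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = utility(itemset, transactions, profit)
            if u >= min_util:
                result[itemset] = u
    return result


if __name__ == "__main__":
    for itemset, u in high_utility_itemsets(15).items():
        print(itemset, u)
```

With a threshold of 15, the itemset `("a", "b")` qualifies even though its subset `("b",)` does not. Utility is not anti-monotone in the way support is, so the subset-based pruning used in frequent itemset mining does not carry over directly, which is one reason HUIM is computationally demanding.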

Author information

Authors and Affiliations

School of Computer and Information Engineering, Henan University, Kaifeng, China
Gaojuan Fan, Huaiyuan Xiao, Chongsheng Zhang & George Almpanidis
Bank of Zhengzhou Co., Ltd., Zhengzhou, China
Huaiyuan Xiao
College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
Philippe Fournier-Viger
Faculty of Software and Information Science, Iwate Prefectural University, Takizawa, Japan
Hamido Fujita

Corresponding author

Correspondence to George Almpanidis.

Editor information

Editors and Affiliations

i-SOMET, Inc., Morioka-shi, Iwate, Japan
Hamido Fujita
College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, Guangdong, China
Philippe Fournier-Viger
Texas State University, San Marcos, TX, USA
Moonis Ali
Shanghai University of Finance and Economics, Shanghai, China
Yinglin Wang

About this paper

Cite this paper

Fan, G., Xiao, H., Zhang, C., Almpanidis, G., Fournier-Viger, P., Fujita, H. (2022). Parallel High Utility Itemset Mining. In: Fujita, H., Fournier-Viger, P., Ali, M., Wang, Y. (eds) Advances and Trends in Artificial Intelligence. Theory and Practices in Artificial Intelligence. IEA/AIE 2022. Lecture Notes in Computer Science, vol 13343. Springer, Cham. https://doi.org/10.1007/978-3-031-08530-7_69

DOI: https://doi.org/10.1007/978-3-031-08530-7_69
Published: 30 August 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-08529-1
Online ISBN: 978-3-031-08530-7
eBook Packages: Computer Science, Computer Science (R0)
