Abstract

Association rule mining is a popular data mining task for finding relationships among items that frequently co-occur in a transactional database. It has many applications, but the “support-confidence” framework it depends on is inadequate in many cases. In recent years, a generalised task called high utility itemset mining (HUIM) has gained much popularity; it aims to discover itemsets that yield high revenue as measured by a utility function. However, on large data volumes, the running time of state-of-the-art HUIM algorithms often grows exponentially. In this work, we investigate parallel HUIM (PHUIM) algorithms and adapt two state-of-the-art sequential HUIM algorithms for parallel processing on the Apache Spark in-memory data processing platform. Extensive experiments on several benchmark and synthetic datasets show that the proposed methods considerably improve the efficiency of the baseline HUIM algorithms.
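
To make the HUIM task concrete, the following is a minimal Python sketch, not the algorithms adapted in this paper. It assumes the common formulation in which the utility of an itemset is the sum of quantity times unit profit over all transactions that contain the whole itemset. The toy database, the profit table, and the function names `itemset_utility` and `high_utility_itemsets` are hypothetical and chosen only for illustration. The exhaustive enumeration also shows why running time can grow exponentially with the number of items.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1, "d": 2},
]
# Hypothetical unit profit per item (the utility function's external weights).
unit_profit = {"a": 5, "b": 2, "c": 1, "d": 4}


def itemset_utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over every transaction containing the whole itemset."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def high_utility_itemsets(transactions, unit_profit, min_utility):
    """Brute-force search: checks all 2^n - 1 non-empty itemsets over n items."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = itemset_utility(itemset, transactions, unit_profit)
            if u >= min_utility:
                result[itemset] = u
    return result


hui = high_utility_itemsets(transactions, unit_profit, min_utility=15)
print(hui)
```

In this toy data, the single item `c` has utility 5 while the superset `{a, c}` has utility 19, so utility is not anti-monotone the way support is. That is one reason support-based pruning does not carry over directly and HUIM algorithms rely on upper bounds instead.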

Corresponding author

Correspondence to George Almpanidis.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Fan, G., Xiao, H., Zhang, C., Almpanidis, G., Fournier-Viger, P., Fujita, H. (2022). Parallel High Utility Itemset Mining. In: Fujita, H., Fournier-Viger, P., Ali, M., Wang, Y. (eds.) Advances and Trends in Artificial Intelligence. Theory and Practices in Artificial Intelligence. IEA/AIE 2022. Lecture Notes in Computer Science, vol. 13343. Springer, Cham. https://doi.org/10.1007/978-3-031-08530-7_69

  • DOI: https://doi.org/10.1007/978-3-031-08530-7_69

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-08529-1

  • Online ISBN: 978-3-031-08530-7

  • eBook Packages: Computer Science (R0)
