mHUIMiner: A Fast High Utility Itemset Mining Algorithm for Sparse Datasets

Peng, Alex Yuxuan; Koh, Yun Sing; Riddle, Patricia

doi:10.1007/978-3-319-57529-2_16

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10235)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

High utility itemset mining is the problem of finding sets of items whose utilities are higher than or equal to a specific threshold. We propose a novel technique called mHUIMiner, which utilises a tree structure to guide the itemset expansion process so that itemsets absent from the database are never considered. Unlike current techniques, it does not rely on a complex pruning strategy that incurs expensive computational overhead. Extensive experiments were conducted to compare mHUIMiner with other state-of-the-art algorithms. The experimental results show that our technique outperforms these algorithms in running time on sparse datasets.
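
To make the problem statement concrete, the following is a minimal brute-force sketch of the high utility itemset definition only. It is not mHUIMiner and does not reproduce the tree structure described in the abstract. The toy transactions, the unit profits, and the function names are all hypothetical, and it assumes the common formulation in which an item's utility in a transaction is its purchased quantity times its unit profit.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 3},
]
# Hypothetical unit profit of each item.
unit_profit = {"a": 5, "b": 2, "c": 1}


def itemset_utility(itemset, transactions, unit_profit):
    """Sum the utility of `itemset` over every transaction containing all its items."""
    total = 0
    for tx in transactions:
        if all(item in tx for item in itemset):
            total += sum(tx[item] * unit_profit[item] for item in itemset)
    return total


def high_utility_itemsets(transactions, unit_profit, min_util):
    """Brute force: score every itemset that occurs in at least one transaction."""
    candidates = set()
    for tx in transactions:
        items = sorted(tx)
        for size in range(1, len(items) + 1):
            candidates.update(combinations(items, size))
    result = {}
    for itemset in candidates:
        utility = itemset_utility(itemset, transactions, unit_profit)
        if utility >= min_util:  # "higher than or equal to" the threshold
            result[itemset] = utility
    return result


print(high_utility_itemsets(transactions, unit_profit, min_util=15))
```

With a threshold of 15, the itemsets ("a",), ("a", "b") and ("a", "c") qualify, with utilities 20, 16 and 19. Note that ("b",) alone has utility 6 while its superset ("a", "b") reaches 16, so a low-utility itemset cannot simply be discarded together with its supersets. This is why algorithms for the problem need dedicated pruning or search-guiding strategies.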

Author information

Authors and Affiliations

The University of Auckland, Auckland, New Zealand
Alex Yuxuan Peng, Yun Sing Koh & Patricia Riddle

Corresponding author

Correspondence to Yun Sing Koh.

Editor information

Editors and Affiliations

Kangwon National University, Chuncheon, Korea (Republic of)
Jinho Kim
Seoul National University, Seoul, Korea (Republic of)
Kyuseok Shim
University of Technology Sydney, Sydney, New South Wales, Australia
Longbing Cao
KAIST, Daejeon, Korea (Republic of)
Jae-Gil Lee
University of New South Wales, Sydney, New South Wales, Australia
Xuemin Lin
Kangwon National University, Chuncheon, Korea (Republic of)
Yang-Sae Moon

About this paper

Cite this paper

Peng, A.Y., Koh, Y.S., Riddle, P. (2017). mHUIMiner: A Fast High Utility Itemset Mining Algorithm for Sparse Datasets. In: Kim, J., Shim, K., Cao, L., Lee, JG., Lin, X., Moon, YS. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2017. Lecture Notes in Computer Science, vol 10235. Springer, Cham. https://doi.org/10.1007/978-3-319-57529-2_16

DOI: https://doi.org/10.1007/978-3-319-57529-2_16
Published: 23 April 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-57528-5
Online ISBN: 978-3-319-57529-2
eBook Packages: Computer Science, Computer Science (R0)
