High Utility Rare Itemset Mining over Transaction Databases

Goyal, Vikram; Dawar, Siddharth; Sureka, Ashish

doi:10.1007/978-3-319-16313-0_3

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8999)

Included in the following conference series:

International Workshop on Databases in Networked Information Systems

Abstract

High-utility rare itemset (HURI) mining finds the itemsets in a database whose utility is no less than a given minimum utility threshold and whose support is less than a given frequency threshold. Identifying such itemsets can support better business decisions by highlighting rare itemsets that yield high profit, so that they can be marketed more actively. Some two-phase algorithms have been proposed to mine high-utility rare itemsets: the rare itemsets are generated in the first phase, and the high-utility rare itemsets are extracted from them in the second. A two-phase solution is inefficient, however, because the number of rare itemsets is enormous and grows very quickly as the frequency threshold increases. In this paper, we propose an algorithm, UP-Rare Growth, which uses the UP-Tree data structure to find high-utility rare itemsets in a transaction database. Instead of finding the rare itemsets explicitly, the proposed algorithm considers the frequency and the utility of itemsets together. We also propose a couple of effective strategies that avoid searching unpromising branches of the tree. Extensive experiments show that the proposed algorithm outperforms the state-of-the-art algorithms in terms of the number of candidates.
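
To make the problem definition in the abstract concrete, the following is a minimal brute-force sketch of what a high-utility rare itemset is. It is not UP-Rare Growth and does not use a UP-Tree; it simply enumerates every itemset and checks both thresholds. The toy database, the unit profits, the function names, the use of absolute transaction counts for support, and the definition of utility as quantity times unit profit are all assumptions made for illustration, since the abstract does not spell them out.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
TRANSACTIONS = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
    {"a": 1, "b": 1, "c": 1},
]
# Hypothetical per-unit profit of each item.
UNIT_PROFIT = {"a": 5, "b": 1, "c": 10}


def contains(transaction, itemset):
    """True if every item of the itemset occurs in the transaction."""
    return all(item in transaction for item in itemset)


def support(itemset, db):
    """Number of transactions that contain the whole itemset (absolute count)."""
    return sum(1 for t in db if contains(t, itemset))


def utility(itemset, db, profit):
    """Sum of quantity * unit profit over all transactions containing the itemset."""
    return sum(
        sum(t[item] * profit[item] for item in itemset)
        for t in db
        if contains(t, itemset)
    )


def brute_force_huri(db, profit, min_util, freq_threshold):
    """Return {itemset: (support, utility)} with utility >= min_util
    and support < freq_threshold. Exponential in the number of items."""
    items = sorted({item for t in db for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            sup = support(itemset, db)
            if sup == 0:
                continue  # itemset never occurs, so it has no utility
            util = utility(itemset, db, profit)
            if util >= min_util and sup < freq_threshold:
                result[itemset] = (sup, util)
    return result


huri = brute_force_huri(TRANSACTIONS, UNIT_PROFIT, min_util=20, freq_threshold=3)
for itemset, (sup, util) in sorted(huri.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), "support:", sup, "utility:", util)
```

On this toy data, the itemsets {a, c} and {b, c} qualify: each occurs in two transactions (below the threshold of 3) and has utility 35 and 42 respectively. The single item c has the highest utility (50) but is excluded because its support of 3 is not below the frequency threshold. Enumerating all itemsets this way grows exponentially with the number of items, which is the kind of cost that candidate-pruning approaches aim to avoid.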

Author information

Authors and Affiliations

Indraprastha Institute of Information Technology-Delhi (IIIT-D), India
Vikram Goyal & Siddharth Dawar
Software Analytics Research Lab (SARL), India
Ashish Sureka

Editor information

Editors and Affiliations

Database Systems Laboratory, University of Aizu, 965-8580, Japan
Wanming Chu, Shinji Kikuchi & Subhash Bhalla

About this paper

Cite this paper

Goyal, V., Dawar, S., Sureka, A. (2015). High Utility Rare Itemset Mining over Transaction Databases. In: Chu, W., Kikuchi, S., Bhalla, S. (eds) Databases in Networked Information Systems. DNIS 2015. Lecture Notes in Computer Science, vol 8999. Springer, Cham. https://doi.org/10.1007/978-3-319-16313-0_3

DOI: https://doi.org/10.1007/978-3-319-16313-0_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16312-3
Online ISBN: 978-3-319-16313-0
eBook Packages: Computer Science, Computer Science (R0)
