A New FP-Tree Algorithm for Mining Frequent Itemsets

Li, Yu-Chiang; Chang, Chin-Chen

doi:10.1007/978-3-540-30483-8_32

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3309)

Included in the following conference series:

Advanced Workshop on Content Computing

Abstract

Data mining has become an important field and has been applied extensively across many areas. Mining frequent itemsets in a transaction database is critical for mining association rules. Many investigations have established that the pattern-growth method outperforms Apriori-like candidate generation. The performance of the pattern-growth method depends on the number of tree nodes. Accordingly, this work presents a new FP-tree structure (the NFP-tree) and develops an efficient approach for mining frequent itemsets based on it, called the NFP-growth approach. The NFP-tree employs two counters in each tree node to reduce the number of tree nodes. Additionally, the header table of the NFP-tree is smaller than that of the FP-tree. Therefore, the total number of nodes across all conditional trees can be reduced. Simulation results reveal that the NFP-growth algorithm is superior to the FP-growth algorithm on dense datasets and real datasets.
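
Because the abstract ties performance to the number of tree nodes, the following minimal sketch shows how that count arises in a baseline FP-tree, the structure the abstract compares against. It is not the NFP-tree: the abstract does not specify the two-counter node layout, so it is not reproduced here. The names `FPNode`, `build_fp_tree`, and `count_nodes`, as well as the toy transactions, are illustrative assumptions and do not come from the paper.

```python
from collections import Counter


class FPNode:
    """One node of a baseline FP-tree: an item, one counter, child links."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build a baseline FP-tree; return (root, supports of frequent items)."""
    # First pass: count in how many transactions each item occurs.
    support = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in support.items() if c >= min_support}

    root = FPNode(None)
    # Second pass: insert the frequent items of each transaction in
    # descending-support order (ties broken by item name) so that
    # common prefixes share tree paths.
    for t in transactions:
        items = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root, frequent


def count_nodes(node):
    """Count the item nodes below `node` (the root itself is excluded)."""
    return sum(1 + count_nodes(child) for child in node.children.values())


# Toy data (hypothetical): item "d" is infrequent and gets pruned.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "c"],
    ["a", "b", "c", "d"],
]
root, frequent = build_fp_tree(transactions, min_support=2)
print(frequent)
print(count_nodes(root))
```

In this toy run, five transactions collapse into six item nodes because shared prefixes are stored once. Any scheme that lowers this node count, such as the two-counter nodes described in the abstract, shrinks the structures that pattern-growth mining has to traverse.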

Author information

Authors and Affiliations

Department of Computer Science and Information Engineering, National Chung Cheng University, Chiayi, 621, Taiwan, ROC
Yu-Chiang Li & Chin-Chen Chang

Editor information

Editors and Affiliations

School of Software, Tsinghua University,
Chi-Hung Chi
School of Software, Tsinghua University, Beijing, PR China
Kwok-Yan Lam

About this paper

Cite this paper

Li, YC., Chang, CC. (2004). A New FP-Tree Algorithm for Mining Frequent Itemsets. In: Chi, CH., Lam, KY. (eds) Content Computing. AWCC 2004. Lecture Notes in Computer Science, vol 3309. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30483-8_32

DOI: https://doi.org/10.1007/978-3-540-30483-8_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23898-0
Online ISBN: 978-3-540-30483-8
eBook Packages: Springer Book Archive
