Abstract
Data mining has become an important field and has been applied extensively across many areas. Mining frequent itemsets in a transaction database is a critical step in mining association rules. Many investigations have established that the pattern-growth method outperforms Apriori-like candidate generation. The performance of the pattern-growth method depends on the number of tree nodes. Accordingly, this work presents a new FP-tree structure (the NFP-tree) and develops an efficient approach for mining frequent itemsets based on it, called the NFP-growth approach. The NFP-tree employs two counters in each tree node to reduce the number of tree nodes. Additionally, the header table of the NFP-tree is smaller than that of the FP-tree. Therefore, the total number of nodes across all conditional trees can be reduced. Simulation results reveal that the NFP-growth algorithm is superior to the FP-growth algorithm on dense datasets and real datasets.
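Because the abstract ties performance to the number of tree nodes, it may help to see how node count arises in the baseline structure. The following is a minimal sketch of a standard FP-tree (in the style of Han et al.), not of the NFP-tree: the abstract does not specify how the two counters are used, so they are not modelled here. The names `FPNode`, `build_fp_tree`, and `count_nodes`, and the toy transactions, are illustrative assumptions.

```python
from collections import Counter


class FPNode:
    """One node of a plain FP-tree: an item, a single counter, and children."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}  # item -> FPNode


def build_fp_tree(transactions, min_support):
    """Build a standard FP-tree (assumed baseline, not the NFP-tree)."""
    # First pass: support count of each item (one count per transaction).
    freq = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in freq.items() if c >= min_support}

    root = FPNode(None)
    # Second pass: insert each transaction's frequent items in
    # descending-support order so that shared prefixes share nodes.
    for t in transactions:
        items = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-frequent[i], i),  # ties broken alphabetically
        )
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root


def count_nodes(node):
    """Number of nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(c) for c in node.children.values())


# Toy data: 12 item occurrences in total; "d" is infrequent and is dropped.
transactions = [
    ["a", "b", "c"],
    ["a", "b", "d"],
    ["a", "c"],
    ["b", "c"],
    ["a", "b", "c"],
]
tree = build_fp_tree(transactions, min_support=2)
print(count_nodes(tree))
```

In this toy case, prefix sharing stores the frequent item occurrences in fewer nodes than there are occurrences. Any structure that lowers the node count further, as the NFP-tree is reported to do, reduces the work spent building and traversing conditional trees.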
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, YC., Chang, CC. (2004). A New FP-Tree Algorithm for Mining Frequent Itemsets. In: Chi, CH., Lam, KY. (eds) Content Computing. AWCC 2004. Lecture Notes in Computer Science, vol 3309. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30483-8_32
DOI: https://doi.org/10.1007/978-3-540-30483-8_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23898-0
Online ISBN: 978-3-540-30483-8
eBook Packages: Springer Book Archive