Abstract
Frequent itemsets, also called frequent patterns, carry important information about databases, and mining them efficiently is a core problem in data mining. Pattern growth approaches, such as the classic FP-Growth algorithm and the efficient FPgrowth* algorithm, solve this problem. They mine frequent itemsets by recursively constructing conditional databases, which are usually represented by prefix-trees. The three major costs of such approaches are prefix-tree traversal, support counting, and prefix-tree construction. This paper presents a novel pattern growth algorithm called BFP-growth in which all three costs are greatly reduced. We compare these costs across BFP-growth, FP-Growth, and FPgrowth*, and show that BFP-growth incurs the lowest. Experimental data show that BFP-growth outperforms not only FP-Growth and FPgrowth* but also several well-known algorithms, including dEclat and LCM, which are among the fastest, on various databases.
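To make the idea of recursively constructed conditional databases concrete, the following is a minimal sketch of generic pattern growth. It is not BFP-growth, FP-Growth, or FPgrowth*: for brevity, each conditional database is a plain list of transactions rather than a prefix-tree, so only the support-counting and database-construction steps appear. The function name `pattern_growth`, its parameters, and the sample data are assumptions made for illustration.

```python
from collections import Counter


def pattern_growth(transactions, min_support, prefix=()):
    """Return {frozenset(itemset): support} for every frequent itemset.

    Illustrative sketch only: conditional databases are plain lists here,
    whereas the algorithms named in the abstract use prefix-trees.
    """
    # Support counting: number of transactions that contain each item.
    counts = Counter(item for t in transactions for item in set(t))
    # Fix an item order so that each itemset is generated exactly once.
    frequent = sorted(i for i, c in counts.items() if c >= min_support)

    results = {}
    for idx, item in enumerate(frequent):
        itemset = prefix + (item,)
        results[frozenset(itemset)] = counts[item]

        # Conditional database of `itemset`: the transactions containing
        # `item`, restricted to frequent items that come later in the order.
        later = set(frequent[idx + 1:])
        conditional = [
            [i for i in t if i in later] for t in transactions if item in t
        ]
        conditional = [t for t in conditional if t]

        # Recurse: frequent itemsets in the conditional database extend `itemset`.
        results.update(pattern_growth(conditional, min_support, itemset))
    return results


if __name__ == "__main__":
    # Hypothetical toy database of five transactions.
    db = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    for itemset, support in sorted(
        pattern_growth(db, min_support=3).items(),
        key=lambda kv: (len(kv[0]), sorted(kv[0])),
    ):
        print(sorted(itemset), support)
```

In this toy database every single item and every pair is frequent at a minimum support of 3, while the triple {a, b, c} occurs in only two transactions and is therefore pruned. A prefix-tree representation compresses transactions that share prefixes, which is where the traversal and construction costs discussed in the abstract arise.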
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Qu, JF., Liu, M. (2012). A High-Performance Algorithm for Frequent Itemset Mining. In: Gao, H., Lim, L., Wang, W., Li, C., Chen, L. (eds) Web-Age Information Management. WAIM 2012. Lecture Notes in Computer Science, vol 7418. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32281-5_8
DOI: https://doi.org/10.1007/978-3-642-32281-5_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32280-8
Online ISBN: 978-3-642-32281-5
eBook Packages: Computer Science (R0)