Abstract
Mining frequent patterns has been widely studied in data mining research. Most current studies adopt an FP_growth-like approach, which avoids candidate generation. However, recursively constructing a conditional frequent pattern tree for each frequent item is costly. In this paper, we propose a depth-first algorithm for mining frequent patterns. Efficiency is achieved with two techniques. First, a large database is compressed into a frequent pattern tree that uses a children table rather than a header table, which avoids costly repeated database scans. Second, the mining algorithm adopts a depth-first method that takes advantage of this tree structure and dynamically adjusts links instead of generating many redundant subtrees, which can dramatically reduce the time and space needed for mining. The performance study shows that our algorithm is efficient and scalable for mining frequent patterns, and is an order of magnitude faster than Trie, FP_growth, H-mine, and some recently reported frequent pattern mining methods.
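To make the tree structure concrete, the following is a minimal Python sketch of compressing a transaction database into a prefix tree whose nodes each hold a children table and which has no header table. It is an illustration under stated assumptions, not the algorithm from the paper: the names `Node`, `build_tree`, and `count_nodes`, the item ordering by descending support, and the sample transactions are all hypothetical, and the depth-first mining step with dynamic link adjustment is not shown because the abstract does not specify it.

```python
from collections import Counter


class Node:
    """Prefix-tree node. `children` maps an item to its child node
    (an assumed reading of the "children table"); no header table is kept."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


def build_tree(transactions, min_support):
    """Build the tree in two database scans (hypothetical helper)."""
    # Scan 1: count the support of each item and keep the frequent ones.
    support = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in support.items() if c >= min_support}

    root = Node()
    # Scan 2: insert each transaction with its frequent items ordered by
    # descending support (ties broken alphabetically), so that common
    # prefixes share nodes.
    for t in transactions:
        items = sorted((i for i in set(t) if i in frequent),
                       key=lambda i: (-frequent[i], i))
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = Node(item)
                node.children[item] = child
            child.count += 1
            node = child
    return root, frequent


def count_nodes(node):
    """Count the nodes in the tree, including the root, depth first."""
    return 1 + sum(count_nodes(c) for c in node.children.values())


# Made-up sample data, for illustration only.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "d"],
]
root, frequent = build_tree(transactions, min_support=2)
```

In this sketch the four transactions collapse into six nodes, counting the root, because transactions that share a prefix share a path. Item `d` falls below the support threshold and never enters the tree.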
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhu, Q., Lin, X. (2007). Depth First Generation of Frequent Patterns Without Candidate Generation. In: Washio, T., et al. Emerging Technologies in Knowledge Discovery and Data Mining. PAKDD 2007. Lecture Notes in Computer Science, vol 4819. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77018-3_38
DOI: https://doi.org/10.1007/978-3-540-77018-3_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77016-9
Online ISBN: 978-3-540-77018-3
eBook Packages: Computer Science (R0)