Fast Mining Maximal Frequent ItemSets Based on FP-Tree

Yan, Yuejin; Li, Zhoujun; Chen, Huowang

doi:10.1007/978-3-540-30464-7_28

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3288)

Included in the following conference series:

International Conference on Conceptual Modeling

Abstract

Mining maximal frequent itemsets (MFI) is a fundamental problem in many data mining applications. Since the MaxMiner algorithm introduced enumeration trees for MFI mining in 1998, several methods have been proposed that use depth-first search to improve performance. This paper presents FIMfi, a new depth-first algorithm for mining MFI based on the FP-tree and the MFI-tree. FIMfi adopts a novel item ordering policy for efficient lookahead pruning and a simple method for fast superset checking. It combines a variety of old and new pruning techniques to reduce the search space. Experimental comparison with previous work shows that FIMfi greatly reduces the number of FP-trees created and outperforms similar algorithms by more than 40% on average.
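To make the terms in the abstract concrete, the following is a minimal brute-force sketch of what a maximal frequent itemset is and what a superset check does. It is not FIMfi and uses neither an FP-tree nor an MFI-tree; the function names, the toy transactions, and the absolute `min_support` threshold are assumptions chosen purely for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return every itemset contained in at least `min_support` transactions.

    Brute force over all item combinations: exponential in the number of
    items, so only suitable for tiny illustrative inputs.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    frequent = []
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            support = sum(1 for t in transactions if candidate <= t)
            if support >= min_support:
                frequent.append(candidate)
    return frequent


def maximal_frequent_itemsets(transactions, min_support):
    """Keep only the frequent itemsets that have no frequent proper superset."""
    candidates = frequent_itemsets(transactions, min_support)
    mfi = []
    # Visit larger itemsets first, so any frequent superset of a candidate
    # has already been stored in `mfi` when the candidate is examined.
    for candidate in sorted(candidates, key=len, reverse=True):
        # Superset check: discard the candidate if a stored itemset contains it.
        if not any(candidate < kept for kept in mfi):
            mfi.append(candidate)
    return mfi


# Hypothetical toy database of five transactions.
toy_transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c", "d"},
]

for itemset in maximal_frequent_itemsets(toy_transactions, min_support=2):
    print(sorted(itemset))  # prints ['a', 'b', 'c']
```

With `min_support=2` the only maximal frequent itemset in this toy database is {a, b, c}, because every other frequent itemset is a subset of it. Raising the threshold to 3 makes {a, b, c} infrequent and yields {a, b}, {a, c} and {b, c} instead. Depth-first algorithms such as the one described in the abstract aim to reach the same result without enumerating every candidate, which is what the pruning and fast superset checking are for.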

Author information

Authors and Affiliations

School of Computer Science, National University of Defense Technology, Changsha, 410073, China
Yuejin Yan, Zhoujun Li & Huowang Chen

Editor information

Editors and Affiliations

Informatica e Automazione, Università Roma Tre, Via Vasca Navale 79, 00146, Roma, Italy
Paolo Atzeni
Computer Science Department, University of California, 3731 Boelter Hall, 90095, Los Angeles, CA, USA
Wesley Chu
Department of Computer Science, Tsinghua University, 100084, Beijing, P.R. China
Hongjun Lu
Department of Computer Science and Engineering, Fudan University, 200433, China
Shuigeng Zhou
School of Computing, National University of Singapore,
Tok-Wang Ling

About this paper

Cite this paper

Yan, Y., Li, Z., Chen, H. (2004). Fast Mining Maximal Frequent ItemSets Based on FP-Tree. In: Atzeni, P., Chu, W., Lu, H., Zhou, S., Ling, TW. (eds) Conceptual Modeling – ER 2004. ER 2004. Lecture Notes in Computer Science, vol 3288. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30464-7_28

DOI: https://doi.org/10.1007/978-3-540-30464-7_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23723-5
Online ISBN: 978-3-540-30464-7
eBook Packages: Springer Book Archive
