DRFP-tree: disk-resident frequent pattern tree

Adnan, Muhaimenul; Alhajj, Reda

doi:10.1007/s10489-007-0099-2

DRFP-tree: disk-resident frequent pattern tree

Published: 19 October 2007

Volume 30, pages 84–97, (2009)
Cite this article

Applied Intelligence Aims and scope Submit manuscript

Muhaimenul Adnan¹ &
Reda Alhajj^1,2

275 Accesses
32 Citations
Explore all metrics

Abstract

Frequent itemset mining methods basically address time scalability and greatly rely on available physical memory. However, the size of real-world databases to be mined is exponentially increasing, and hence main memory size is a serious bottleneck of the existing methods. So, it is necessary to develop new methods that do not fully rely on physical memory; new methods that utilize the secondary storage in the mining process should be the target. This motivates the work described in this paper; we mainly propose (Disk Resident Frequent Pattern) DRFP-Growth as a disk based approach similar to FP-Growth. DRFP-growth uses DRFP-tree, which is treated exactly as FP-tree when constructed in main memory and gets into a modified structure when it turns into disk resident to overcome the main memory bottleneck. This way, we are able to mine for frequent itemsets from databases of arbitrary sizes without being restricted by the available physical memory. In other words, we initially try to mine the database using the original FP-growth; we expand into the secondary memory only if we run out of physical memory. So, DRFP-growth is very comparable to FP-growth for small databases and high support threshold values. On the other hand, using DRFP-growth, we are still able to mine huge databases for low support threshold values (the only limitation is the available secondary storage rather than physical memory). The reported test results demonstrate how the proposed approach succeeds for cases where main memory based approaches fail.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

References

Adnan M, Alhajj R, Barker K (2006) Constructing complete fp-tree for incremental mining of frequent patterns in dynamic databases. In: Proceedings of the international conference on industrial & engineering applications of artificial intelligence & expert systems, Annecy, France, June 2006. Springer, Berlin,
Google Scholar
Adnan M, Alhajj R, Barker K (2006) Alternative method for incrementally constructing the fp-tree. In: Proceedings of IEEE international conference on intelligent systems, UK, September 2006
Agrawal R, Imieliski T, Swami A (1993) Mining association rules between sets of items in large databases. In: Proceedings of ACM SIGMOD. ACM, New York, pp 207–216
Google Scholar
Agrawal R, Srikant R (1998) Fast algorithms for mining association rules, pp 580–592
Amir A, Feldman R, Kashi R (1997) A new and versatile method for association generation. Inf Sys 22(6-7):333–347
Article MATH Google Scholar
Cheung C-F, Yu JX, Lu H (2005) Constructing suffix tree for gigabyte sequences with megabyte memory. IEEE Trans Knowl Data Eng 17(1):90–105
Article Google Scholar
Cheung W, Zaiane OR (2003) Incremental mining of frequent patterns without candidate generation or support constraint. In: Proceedings of IEEE IDEAS, p 111
El-Hajj M, Zaiane OR (2004) Cofi approach for mining frequent itemsets revisited. In: Proceedings of ACM SIGMOD workshop on research issues in data mining and knowledge discovery. ACM, New York, pp 70–75
Chapter Google Scholar
Elmasri R, Navathe SB (2000) Fundamentals of database systems. Addison-Wesley, Reading
Google Scholar
Ganti V, Gehrke J, Ramakrishnan R (2001) Demon: mining and monitoring evolving data. IEEE Trans Knowl Data Eng 13(1):50–63
Article Google Scholar
Goethals B (2004) Memory issues in frequent itemset mining. In: Proceedings of ACM symposium on applied computing. ACM, New York, pp 530–534
Chapter Google Scholar
Han J, Pei J, Yin Y (2000) Mining frequent patterns without candidate generation. In: Proceedings of ACM SIGMOD, Dallas, TX, pp 1–12
Leung CK-S, Khan QI, Hoque T (2005) Cantree: a tree structure for efficient incremental mining of frequent patterns. In: Proceedings of IEEE ICDM, pp 274–281
Vaarandi R (2004) A breadth-first algorithm for mining frequent patterns from event logs. In: Proceedings of INTELLCOMM, pp 293–308
Zaki MJ (2000) Scalable algorithms for association mining. IEEE Trans Knowl Data Eng 12(3):372–390
Article MathSciNet Google Scholar

Download references

Author information

Authors and Affiliations

Department of Computer Science, University of Calgary, Calgary, AB, Canada
Muhaimenul Adnan & Reda Alhajj
Department of Computer Science, Global University, Beirut, Lebanon
Reda Alhajj

Authors

Muhaimenul Adnan
View author publications
You can also search for this author in PubMed Google Scholar
Reda Alhajj
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Reda Alhajj.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Adnan, M., Alhajj, R. DRFP-tree: disk-resident frequent pattern tree. Appl Intell 30, 84–97 (2009). https://doi.org/10.1007/s10489-007-0099-2

Download citation

Received: 07 February 2007
Accepted: 11 September 2007
Published: 19 October 2007
Issue Date: April 2009
DOI: https://doi.org/10.1007/s10489-007-0099-2

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

DRFP-tree: disk-resident frequent pattern tree

Abstract

Access this article

Similar content being viewed by others

An efficient join operations for utility list-based high-utility mining approaches using hybrid search technique

Big data analytics on Apache Spark

Big data preprocessing: methods and prospects

References

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

Navigation

DRFP-tree: disk-resident frequent pattern tree

Abstract

Access this article

Similar content being viewed by others

An efficient join operations for utility list-based high-utility mining approaches using hybrid search technique

Big data analytics on Apache Spark

Big data preprocessing: methods and prospects

References

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation