Efficient mining frequent itemsets algorithms

Mohamed, Marghny H.; Darwieesh, Mohammed M.

doi:10.1007/s13042-013-0172-6

Original Article
Published: 26 May 2013

Volume 5, pages 823–833, (2014)

International Journal of Machine Learning and Cybernetics

Marghny H. Mohamed¹ &
Mohammed M. Darwieesh²

Abstract

Efficient algorithms for mining frequent itemsets are crucial for mining association rules as well as for many other data mining tasks. It is well known that the countTable is one of the most important facilities for exploiting the subset property to compress the transaction database into a new, lower representation of item occurrences. One of the biggest problems with this technique is the cost of candidate generation and test processing, the two most important steps in finding association rules. In this paper, we develop this method further to avoid the costly candidate-generation-and-test processing completely. Moreover, the proposed methods compress crucial information about all itemsets, maximal-length frequent itemsets, and minimal-length frequent itemsets, and they avoid expensive, repeated database scans. The proposed algorithms, named CountTableFI and BinaryCountTableF, differ significantly from Apriori and from all other algorithms that extend Apriori. The idea behind them lies in the representation of the transactions: we represent every transaction as a binary number and as a decimal number, so that the subset and identical-set properties are simple and fast to apply. A comprehensive performance study shows that our techniques are efficient and scalable compared with other methods.
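
The representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' CountTableFI or BinaryCountTableF algorithm, whose details are not given here. It only assumes that each transaction is encoded as a binary number (one bit per item) whose integer value serves as the decimal form, that identical transactions are merged into one count-table entry, and that support is obtained with a bitwise subset test. The names `encode`, `build_count_table`, `support` and the sample data are hypothetical.

```python
from collections import Counter

def encode(transaction, item_index):
    """Encode a transaction as an integer bitmask: bit i is set if item i occurs."""
    mask = 0
    for item in transaction:
        mask |= 1 << item_index[item]
    return mask

def build_count_table(transactions, item_index):
    """Merge identical transactions: map each distinct bitmask to its occurrence count."""
    return Counter(encode(t, item_index) for t in transactions)

def support(itemset_mask, count_table):
    """Sum the counts of all entries that contain the itemset (bitwise subset test)."""
    return sum(
        count
        for mask, count in count_table.items()
        if (mask & itemset_mask) == itemset_mask
    )

# Hypothetical sample data, chosen only for illustration.
item_index = {"a": 0, "b": 1, "c": 2, "d": 3}
transactions = [{"a", "b"}, {"a", "b"}, {"a", "b", "c"}, {"c", "d"}, {"a"}]

count_table = build_count_table(transactions, item_index)
ab_support = support(encode({"a", "b"}, item_index), count_table)
print(dict(count_table))  # {3: 2, 7: 1, 12: 1, 1: 1}
print(ab_support)         # 3
```

In this sketch the five transactions collapse into four table entries, and the support of {a, b} is read from the table in one pass without rescanning the original database. How the published algorithms avoid candidate generation is not shown.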

Author information

Authors and Affiliations

Faculty of Computers and Information, Assiut University, Assiut, Egypt
Marghny H. Mohamed
Mathematics Department, Faculty of Science, Assiut University, Assiut, Egypt
Mohammed M. Darwieesh

Corresponding author

Correspondence to Marghny H. Mohamed.

About this article

Cite this article

Mohamed, M.H., Darwieesh, M.M. Efficient mining frequent itemsets algorithms. Int. J. Mach. Learn. & Cyber. 5, 823–833 (2014). https://doi.org/10.1007/s13042-013-0172-6

Received: 07 March 2012
Accepted: 29 April 2013
Published: 26 May 2013
Issue Date: December 2014
DOI: https://doi.org/10.1007/s13042-013-0172-6
