Top-K Miner: top-K identical frequent itemsets discovery without user support threshold

Saif-Ur-Rehman; Ashraf, Jawad; Habib, Asad; Salam, Abdus

doi:10.1007/s10115-015-0907-7

Regular Paper
Published: 31 December 2015

Volume 48, pages 741–762, (2016)
Knowledge and Information Systems

Saif-Ur-Rehman¹, Jawad Ashraf¹, Asad Habib¹ & Abdus Salam²

Abstract

Frequent itemset (FI) mining is a prime research area in association rule mining. Customary techniques find FIs or their variants either on the basis of a support threshold or by setting two generic parameters, i.e., N (topmost itemsets) and \(K_{\mathrm{max}}\) (size of the itemsets). However, users are unable to mine the exact desired number of patterns because they tune these approaches with approximate parameter settings. We propose a novel technique, top-K Miner, that does not require setting the support threshold, N, or \(K_{\mathrm{max}}\). Top-K Miner requires the user to specify only a single parameter, K, to find the desired number of frequent patterns, called identical frequent itemsets (IFIs). Top-K Miner uses a novel candidate production algorithm called the join-FI algorithm, which uses frequent 2-itemsets to yield one or more candidate itemsets of arbitrary size. The join-FI algorithm follows a bottom-up recursive technique to construct a candidate-itemsets-search tree. Finally, the generated candidate itemsets are processed by the Maintain-Top-K_List algorithm to produce the Top-K_List of IFIs. The proposed top-K Miner algorithm significantly outperforms the generic benchmark techniques even when they run with ideal parameter settings.
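
To make the single-parameter idea concrete, the following is a minimal sketch and not the join-FI or Maintain-Top-K_List algorithms themselves, whose details are not given in the abstract. It assumes that "identical" frequent itemsets are itemsets sharing the same support, and that the result holds the K highest distinct support values together with every itemset attaining them. The function name `top_k_identical_itemsets` and the tidset-intersection search are illustrative choices only.

```python
from collections import defaultdict


def top_k_identical_itemsets(transactions, k):
    """Return {support: [itemsets]} for the k highest distinct supports.

    Assumption (not from the paper): itemsets are "identical" when they
    have the same support. The only user parameter is k.
    """
    if k < 1:
        raise ValueError("k must be at least 1")

    # Vertical layout: item -> ids of the transactions that contain it.
    tidsets = defaultdict(set)
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            tidsets[item].add(tid)

    top = defaultdict(list)  # support -> itemsets sharing that support

    def threshold():
        # Lowest support still worth keeping; it only rises over time.
        return min(top) if len(top) >= k else 1

    def record(itemset, support):
        top[support].append(itemset)
        if len(top) > k:  # a (k+1)-th support value appeared: drop the lowest
            del top[min(top)]

    def extend(prefix, prefix_tids, candidates):
        # Bottom-up recursive growth; supersets never gain support, so a
        # candidate below the current threshold can be skipped safely.
        for i, item in enumerate(candidates):
            tids = prefix_tids & tidsets[item]
            if len(tids) < threshold():
                continue
            itemset = prefix + (item,)
            record(itemset, len(tids))
            extend(itemset, tids, candidates[i + 1:])

    extend((), set(range(len(transactions))), sorted(tidsets))
    return dict(sorted(top.items(), reverse=True))


baskets = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
result = top_k_identical_itemsets(baskets, k=2)
# result[4] holds the three single items, result[3] the three pairs.
```

Here the support threshold is never supplied by the user: it starts at 1 and is raised internally as soon as K distinct support values have been seen, which is one plausible way to realise mining "without user support threshold".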

Author information

Authors and Affiliations

Institute of Information Technology, Kohat University of Science and Technology, Kohat, Pakistan
Saif-Ur-Rehman, Jawad Ashraf & Asad Habib
Computer Science Department, Abasyn University, Peshawar, Pakistan
Abdus Salam

Corresponding author

Correspondence to Saif-Ur-Rehman.

About this article

Cite this article

Saif-Ur-Rehman, Ashraf, J., Habib, A. et al. Top-K Miner: top-K identical frequent itemsets discovery without user support threshold. Knowl Inf Syst 48, 741–762 (2016). https://doi.org/10.1007/s10115-015-0907-7

Received: 14 July 2014
Revised: 12 September 2015
Accepted: 24 November 2015
Published: 31 December 2015
Issue Date: September 2016
DOI: https://doi.org/10.1007/s10115-015-0907-7
