Mining top-k frequent patterns with combination reducing techniques

Pyun, Gwangbum; Yun, Unil

doi:10.1007/s10489-013-0506-9

Published: 22 January 2014

Volume 41, pages 76–98 (2014)

Applied Intelligence

Gwangbum Pyun¹ & Unil Yun¹

Abstract

Top-k frequent pattern mining finds the k patterns with the highest supports, from the pattern with the highest support down to the one with the k-th support. Because it does not require a user-specified minimum support threshold, the approach can be applied effectively in fields such as marketing, finance, and bio-data analysis. Top-k mining methods instead use the support of the k-th pattern as the threshold, so they must conduct mining operations with very low supports until the k-th pattern is detected. When a low support is used in the mining process, single-paths with numerous items are generated, and the top-k mining algorithm extracts valid patterns by combining the items of each single-path. Consequently, the more combinations there are, the more time and memory are consumed. In this paper, to mine top-k frequent patterns more efficiently, we convert patterns obtained from single-paths into composite patterns during the mining process and recover the original patterns when the top-k frequent patterns are extracted. To this end, we define a new concept, the composite pattern, and propose novel techniques for reducing pattern combinations in single-paths. We introduce two algorithms: CRM (Combination Reducing method), which applies our reduction technique, and CRMN (Combination Reducing method for N-itemset), which also considers N-itemsets, i.e., pattern lengths. A performance evaluation shows that the CRM and CRMN algorithms reduce pattern combinations in single-paths efficiently compared to state-of-the-art algorithms. The experimental results also show that our approaches have outstanding performance in terms of runtime, memory, and scalability.
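To make the combination problem concrete, the following minimal Python sketch enumerates every pattern that a single-path can produce and keeps the k best by support. It illustrates only the baseline behaviour described in the abstract, not the CRM or CRMN algorithms, whose composite-pattern details are not given here. The function names `single_path_patterns` and `top_k`, the `(item, count)` path representation, and the arbitrary breaking of ties at the k-th support are assumptions made for illustration. The sketch relies on the standard FP-tree property that the support of a combination taken from a single-path is the smallest count among its items.

```python
import heapq
from itertools import combinations


def single_path_patterns(path):
    """Yield (itemset, support) for every non-empty combination of a single-path.

    `path` is a list of (item, count) pairs ordered from root to leaf
    (hypothetical representation). The support of a combination is the
    minimum count among its items.
    """
    for size in range(1, len(path) + 1):
        for combo in combinations(path, size):
            items = tuple(item for item, _ in combo)
            support = min(count for _, count in combo)
            yield items, support


def top_k(patterns, k):
    """Keep the k patterns with the highest support using a min-heap.

    heap[0] always holds the current k-th support, which plays the role of
    the rising threshold. Ties at the k-th support are broken arbitrarily.
    """
    heap = []
    for items, support in patterns:
        if len(heap) < k:
            heapq.heappush(heap, (support, items))
        elif support > heap[0][0]:
            heapq.heapreplace(heap, (support, items))
    return sorted(heap, reverse=True)


# A single-path with 4 items already yields 2**4 - 1 = 15 patterns.
path = [("a", 5), ("b", 4), ("c", 4), ("d", 2)]
all_patterns = list(single_path_patterns(path))
print(len(all_patterns))
print(top_k(all_patterns, 3))
```

A single-path with n items produces 2^n − 1 candidate patterns, so the enumeration doubles with every additional item. This growth is the cost that the combination reducing techniques aim to avoid.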

Acknowledgements

This research was supported by the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (NRF No. 2013005682 and 20080062611).

Author information

Authors and Affiliations

Department of Computer Engineering, Sejong University, Seoul, Korea
Gwangbum Pyun & Unil Yun

Corresponding author

Correspondence to Unil Yun.

About this article

Cite this article

Pyun, G., Yun, U. Mining top-k frequent patterns with combination reducing techniques. Appl Intell 41, 76–98 (2014). https://doi.org/10.1007/s10489-013-0506-9

Published: 22 January 2014
Issue Date: July 2014
DOI: https://doi.org/10.1007/s10489-013-0506-9
