Abstract
High-utility itemset mining (HUIM) has become an important topic in recent years because, unlike frequent itemset mining (FIM) or association-rule mining (ARM), it considers both the quantity and the profit of items and can therefore reveal profitable products. Several algorithms have been presented to mine high-utility itemsets (HUIs), and most of them must handle an exponential search space when the number of distinct items and the size of the database are very large. Previously, a heuristic algorithm, HUPE\(_{\mathrm{umu}}\)-GRAM, was proposed to mine HUIs based on the genetic algorithm (GA). Among evolutionary computation (EC) techniques, particle swarm optimization (PSO) requires fewer parameters than GA-based approaches. Since traditional PSO is designed for continuous problems, this paper adopts discrete PSO, which encodes the particles as binary variables. An efficient PSO-based algorithm, namely HUIM-BPSO, is proposed to find HUIs. The designed HUIM-BPSO algorithm first finds the high-transaction-weighted utilization 1-itemsets (1-HTWUIs) based on the transaction-weighted utility (TWU) model and uses their number as the particle size, which greatly reduces the number of combinations to be explored in the evolution process. The sigmoid function is adopted in the updating process of the particles. An OR/NOR-tree structure is further developed to reduce the invalid combinations when discovering HUIs. Substantial experiments on real-life datasets show that the proposed algorithm outperforms the other heuristic algorithms for mining HUIs in terms of execution time, number of discovered HUIs, and convergence.
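The two ingredients named in the abstract, TWU-based selection of 1-HTWUIs and a sigmoid-driven binary particle update, can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the toy database, the unit profits, the threshold `min_utility`, the helper names, and the parameters `w`, `c1`, `c2` and `vmax`. The update follows the standard binary PSO rule rather than the exact rule of HUIM-BPSO, which the abstract does not give, and the OR/NOR-tree pruning is omitted.

```python
import math
import random

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "d": 4},
    {"a": 1, "b": 1, "c": 2, "d": 1},
]
# Hypothetical unit profit of each item.
profit = {"a": 5, "b": 2, "c": 1, "d": 3}


def transaction_utility(tx):
    """Utility of one transaction: sum of quantity * unit profit."""
    return sum(q * profit[i] for i, q in tx.items())


def twu_table(db):
    """TWU of an item: sum of the utilities of all transactions containing it."""
    twu = {}
    for tx in db:
        tu = transaction_utility(tx)
        for item in tx:
            twu[item] = twu.get(item, 0) + tu
    return twu


def utility(itemset, db):
    """Exact utility of an itemset, summed over transactions containing all its items."""
    return sum(
        sum(tx[i] * profit[i] for i in itemset)
        for tx in db
        if all(i in tx for i in itemset)
    )


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def update_particle(position, velocity, pbest, gbest, rng,
                    w=1.0, c1=2.0, c2=2.0, vmax=4.0):
    """One standard binary PSO step: bit j is set to 1 with probability sigmoid(v_j)."""
    new_pos, new_vel = [], []
    for x, v, p, g in zip(position, velocity, pbest, gbest):
        v = w * v + c1 * rng.random() * (p - x) + c2 * rng.random() * (g - x)
        v = max(-vmax, min(vmax, v))  # clamp so sigmoid(v) never saturates fully
        new_vel.append(v)
        new_pos.append(1 if rng.random() < sigmoid(v) else 0)
    return new_pos, new_vel


min_utility = 30  # hypothetical absolute minimum-utility threshold
twu = twu_table(transactions)
# Items whose TWU reaches the threshold; each one gets a bit in the particle.
htwui_1 = sorted(i for i, t in twu.items() if t >= min_utility)

rng = random.Random(0)
position = [rng.randint(0, 1) for _ in htwui_1]
velocity = [0.0] * len(htwui_1)
position, velocity = update_particle(
    position, velocity, pbest=position, gbest=[1, 0, 1], rng=rng
)
# Decode the particle back into a candidate itemset and score it.
candidate = [item for item, bit in zip(htwui_1, position) if bit]
fitness = utility(candidate, transactions)
```

In this toy data, item `d` has a TWU of 26 and is dropped, so each particle needs only three bits instead of four. Using the itemset's utility as the fitness value is likewise an assumption of this sketch.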
Acknowledgments
This research was partially supported by the Shenzhen Peacock Project, China, under Grant KQC201109020055A, by the National Natural Science Foundation of China (NSFC) under Grant No. 61503092, by the Natural Scientific Research Innovation Foundation in Harbin Institute of Technology under Grant HIT.NSRIF.2014100, and by the Shenzhen Strategic Emerging Industries Program under Grant ZDSY20120613125016389.
Ethics declarations
Conflict of interest
The authors declare that there are no conflicts of interest in this paper.
Ethical approval
This article does not contain any studies with human participants performed by any of the authors.
Additional information
Communicated by V. Loia.
About this article
Cite this article
Lin, J.CW., Yang, L., Fournier-Viger, P. et al. A binary PSO approach to mine high-utility itemsets. Soft Comput 21, 5103–5121 (2017). https://doi.org/10.1007/s00500-016-2106-1
DOI: https://doi.org/10.1007/s00500-016-2106-1