A binary PSO approach to mine high-utility itemsets

  • Methodologies and Application
Soft Computing

Abstract

High-utility itemset mining (HUIM) has become an important topic in recent years because, unlike frequent itemset mining (FIM) or association-rule mining (ARM), it reveals profitable products by considering both the quantity and the profit of items. Several algorithms have been presented to mine high-utility itemsets (HUIs), and most of them must handle an exponential search space when the number of distinct items and the size of the database are very large. Previously, a heuristic algorithm, HUPE\(_\mathrm{umu}\)-GRAM, was proposed to mine HUIs based on a genetic algorithm (GA). Among evolutionary computation (EC) techniques, particle swarm optimization (PSO) requires fewer parameters than GA-based approaches. Because traditional PSO is designed for continuous problems, this paper adopts discrete PSO, which encodes the particles as binary variables. An efficient PSO-based algorithm, HUIM-BPSO, is proposed to find HUIs. Based on the transaction-weighted utility (TWU) model, HUIM-BPSO first finds the high-transaction-weighted utilization 1-itemsets (1-HTWUIs) and uses their number as the particle size, which greatly reduces the combinatorial problem in the evolution process. The sigmoid function is adopted in the particle-updating process. An OR/NOR-tree structure is further developed to avoid invalid combinations when discovering HUIs. Substantial experiments on real-life datasets show that the proposed algorithm outperforms the other heuristic algorithms for mining HUIs in terms of execution time, number of discovered HUIs, and convergence.
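The ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy database, the unit profits, the threshold, and all function names are hypothetical, and the position rule shown is the standard sigmoid rule of binary PSO (Kennedy and Eberhart), which the abstract says is adopted but does not spell out. The OR/NOR-tree pruning is omitted.

```python
import math
import random

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"b": 1, "c": 3},
    {"a": 2, "c": 1, "d": 1},
]
# Hypothetical unit profit of each item.
profit = {"a": 5, "b": 2, "c": 1, "d": 10}


def transaction_utility(t):
    """Total utility (quantity * profit) of all items in one transaction."""
    return sum(q * profit[i] for i, q in t.items())


def twu(item):
    """TWU of an item: sum of the utilities of the transactions containing it."""
    return sum(transaction_utility(t) for t in transactions if item in t)


def high_twu_items(min_util):
    """1-HTWUIs: items whose TWU reaches the minimum utility threshold."""
    items = sorted({i for t in transactions for i in t})
    return [i for i in items if twu(i) >= min_util]


def utility(itemset):
    """Utility of an itemset, summed over transactions containing all its items."""
    return sum(
        sum(t[i] * profit[i] for i in itemset)
        for t in transactions
        if all(i in t for i in itemset)
    )


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def update_position(velocity, rng):
    """Binary PSO rule: bit j becomes 1 with probability sigmoid(velocity[j])."""
    return [1 if rng.random() < sigmoid(v) else 0 for v in velocity]


def decode(bits, items):
    """Map a binary particle back to the itemset it encodes."""
    return [item for bit, item in zip(bits, items) if bit == 1]


if __name__ == "__main__":
    items = high_twu_items(min_util=20)    # particle size = len(items)
    rng = random.Random(0)
    velocity = [rng.uniform(-1.0, 1.0) for _ in items]
    particle = update_position(velocity, rng)
    candidate = decode(particle, items)
    print(items, particle, candidate, utility(candidate))
```

With the threshold set to 20, item b (TWU 14) is discarded, so each particle needs only three bits instead of four; this is the reduction of the search space that the TWU step provides. The fitness of a particle is the utility of the itemset it decodes to.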

Acknowledgments

This research was partially supported by the Shenzhen Peacock Project, China, under Grant KQC201109020055A, by the National Natural Science Foundation of China (NSFC) under Grant No. 61503092, by the Natural Scientific Research Innovation Foundation in Harbin Institute of Technology under Grant HIT.NSRIF.2014100, and by the Shenzhen Strategic Emerging Industries Program under Grant ZDSY20120613125016389.

Author information

Corresponding author

Correspondence to Jerry Chun-Wei Lin.

Ethics declarations

Conflict of interest

The authors declare that there are no conflicts of interest in this paper.

Ethical approval

This article does not contain any studies with human participants performed by any of the authors.

Additional information

Communicated by V. Loia.

About this article

Cite this article

Lin, J.CW., Yang, L., Fournier-Viger, P. et al. A binary PSO approach to mine high-utility itemsets. Soft Comput 21, 5103–5121 (2017). https://doi.org/10.1007/s00500-016-2106-1

  • DOI: https://doi.org/10.1007/s00500-016-2106-1
