Feature selection based on a hybrid simplified particle swarm optimization algorithm with maximum separation and minimum redundancy

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

Feature selection is an important data-processing technique in machine learning and data mining. Its goal is to select the feature subset that achieves the maximum classification accuracy with the minimum number of features. Using the particle swarm algorithm to find the optimal subset in a high-dimensional data set suffers from two problems, falling into local optima and expensive computation, which reduce classification accuracy. To address these problems, this paper proposes a hybrid simplified PSO-based feature selection algorithm with an elite strategy (HECSPSO). It makes the following improvements: (1) In the population initialization stage, the conditional separation probability (\(S\_\mathrm{probability}\)) of features is redefined according to their separation performance, and a new population initialization strategy is proposed on this basis. (2) To further improve the convergence speed of the algorithm, an addition and deletion criterion of maximum separation and minimum redundancy, called the elite strategy, is proposed according to the separation and redundancy of features. (3) To reduce the complexity of the model, a simplified particle swarm optimization algorithm is proposed. The evolution process is controlled only by the position, which simplifies the iterative process of the particle swarm and avoids the slow convergence and low precision caused by particle velocity. (4) To keep the algorithm from falling into local optima, a chaotic mechanism is used as the local search operator near the known solutions. For a comprehensive evaluation, the proposed method is compared with other algorithms based on particle swarm optimization.
On 16 data sets from the UCI Machine Learning Repository (University of California, Irvine), the methods are compared and evaluated in three respects: classification accuracy, size of the selected feature subset, and number of iterations needed for convergence. The results show that the proposed algorithm obtains feature subsets with better performance and is highly competitive for feature selection.
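
Improvements (3) and (4) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the update equations or name the chaotic map, so the copy-probability update rule, the logistic map, the toy fitness function, and every function and parameter name below are assumptions chosen for illustration only. A real wrapper method would replace `toy_fitness` with classifier accuracy on the selected features.

```python
import random


def logistic_map(x, mu=4.0):
    """One step of the logistic map (assumed here as the chaotic sequence)."""
    return mu * x * (1.0 - x)


def position_only_update(position, pbest, gbest, rng, p_personal=0.3, p_global=0.5):
    """Velocity-free update: each bit copies gbest, copies pbest, or is kept."""
    new_position = []
    for bit, pb, gb in zip(position, pbest, gbest):
        r = rng.random()
        if r < p_global:
            new_position.append(gb)
        elif r < p_global + p_personal:
            new_position.append(pb)
        else:
            new_position.append(bit)
    return new_position


def chaotic_local_search(solution, fitness, x, steps=10, flip_threshold=0.9):
    """Flip bits near a known solution using a chaotic sequence.

    Only improving candidates are accepted. Returns the best mask, its
    fitness, and the final chaotic state so the sequence can be continued.
    """
    best, best_fit = list(solution), fitness(solution)
    for _ in range(steps):
        candidate = list(best)
        for i in range(len(candidate)):
            x = logistic_map(x)
            if x > flip_threshold:
                candidate[i] = 1 - candidate[i]
        cand_fit = fitness(candidate)
        if cand_fit > best_fit:
            best, best_fit = candidate, cand_fit
    return best, best_fit, x


def toy_fitness(mask, relevant=frozenset({0, 2, 5})):
    """Placeholder objective: reward relevant features, penalise subset size."""
    hits = sum(1 for i, b in enumerate(mask) if b and i in relevant)
    return hits - 0.1 * sum(mask)


def run(n_features=8, n_particles=6, iterations=30, seed=0):
    """Run the sketch on the toy objective and return (mask, fitness)."""
    rng = random.Random(seed)
    swarm = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(n_particles)]
    pbest = [list(p) for p in swarm]
    gbest = list(max(pbest, key=toy_fitness))
    gbest_fit = toy_fitness(gbest)
    x = 0.37  # initial chaotic state, an arbitrary value in (0, 1)
    for _ in range(iterations):
        for k, particle in enumerate(swarm):
            swarm[k] = position_only_update(particle, pbest[k], gbest, rng)
            if toy_fitness(swarm[k]) > toy_fitness(pbest[k]):
                pbest[k] = list(swarm[k])
        gbest = max(pbest + [gbest], key=toy_fitness)
        gbest, gbest_fit, x = chaotic_local_search(gbest, toy_fitness, x)
    return gbest, gbest_fit


if __name__ == "__main__":
    mask, score = run()
    print("selected features:", [i for i, b in enumerate(mask) if b], "fitness:", score)
```

In this sketch the swarm carries no velocity vector, so each iteration only compares and copies bits, and the chaotic search is the sole source of new bit patterns around the current best solution. The initialization by separation probability and the elite addition and deletion criterion are not sketched, because the abstract does not define them.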

Data availability

The data that support the findings of this study are openly available in the UCI Machine Learning Repository; see reference [34].

References

  1. Bolón-Canedo V, Alonso-Betanzos A (2019) Ensembles for feature selection: a review and future trends. Inf Fusion 52:1–12. https://doi.org/10.1016/j.inffus.2018.11.008

  2. Thabtah F, Kamalov F, Hammoud S et al (2020) Least Loss: a simplified filter method for feature selection. Inf Sci 534:1–15. https://doi.org/10.1016/j.ins.2020.05.017

  3. Mafarja M, Mirjalili S (2018) Whale optimization approaches for wrapper feature selection. Appl Soft Comput 62:441–453. https://doi.org/10.1016/j.asoc.2017.11.006

  4. Venkatesh B, Anuradha J (2019) A hybrid feature selection approach for handling a high-dimensional data. Innovations in computer science and engineering. Springer, Singapore, pp 365–373. https://doi.org/10.1007/978-981-13-7082-3_42

  5. Zhang X, Shi Z, Liu X et al (2018) A hybrid feature selection algorithm for classification unbalanced data processing. IEEE Int Conf Smart Internet of Things (SmartIoT). https://doi.org/10.1109/SmartIoT.2018.00055

  6. Song XF, Zhang Y, Gong DW et al (2021) A fast hybrid feature selection based on correlation-guided clustering and particle swarm optimization for high-dimensional data. IEEE Trans Cybern. https://doi.org/10.1109/TCYB.2021.3061152

  7. Mirjalili S (2019) Genetic algorithm. Evolutionary algorithms and neural networks. Springer, Cham, pp 43–55. https://doi.org/10.1007/978-3-319-93025-1_4

  8. Arqub OA, Abo-Hammour Z (2014) Numerical solution of systems of second-order boundary value problems using continuous genetic algorithm. Inf Sci 279:396–415. https://doi.org/10.1016/j.ins.2014.03.128

  9. Bansal JC (2019) Particle swarm optimization. Evolutionary and swarm intelligence algorithms. Springer, Cham, pp 11–23. https://doi.org/10.1007/978-3-319-91341-4_2

  10. Pathan S, Panwar D (2020) A smart channel estimation approach for LTE systems using PSO algorithm. Ann Optim Theory Pract 3(3):1–13

  11. Civicioglu P, Besdok E (2019) Bernstain-search differential evolution algorithm for numerical function optimization. Expert Syst Appl 138:112831. https://doi.org/10.1016/j.eswa.2019.112831

  12. Uthayakumar J, Metawa N, Shankar K et al (2020) Financial crisis prediction model using ant colony optimization. Int J Inf Manag 50:538–556. https://doi.org/10.1016/j.ijinfomgt.2018.12.001

  13. Zhou J, Yao X, Chan FTS et al (2019) An individual dependent multi-colony artificial bee colony algorithm. Inf Sci 485:114–140. https://doi.org/10.1016/j.ins.2019.02.014

  14. Garg H (2019) A hybrid GSA-GA algorithm for constrained optimization problems. Inf Sci 478:499–523. https://doi.org/10.1016/j.ins.2018.11.041

  15. Garg H (2015) A hybrid GA-GSA algorithm for optimizing the performance of an industrial system by utilizing uncertain data. IGI Global, London, pp 620–654

  16. Garg H (2016) A hybrid PSO-GA algorithm for constrained optimization problems. Appl Math Comput 274:292–305. https://doi.org/10.1016/j.amc.2015.11.001

  17. Solorio-Fernández S, Carrasco-Ochoa JA, Martínez-Trinidad JF (2016) Feature selection based on hybridization of genetic algorithm and particle swarm optimization. Neurocomputing 214:866–880. https://doi.org/10.1016/j.neucom.2016.07.026

  18. Du SY, Liu ZG (2020) Hybridizing particle swarm optimization with JADE for continuous optimization. Multimedia Tools Appl 79(7):4619–4636. https://doi.org/10.1007/s11042-019-08142-7

  19. Kennedy J, Eberhart RC (1997) A discrete binary version of the particle swarm algorithm. IEEE Int Conf Syst Man Cybern 5:4104–4108. https://doi.org/10.1109/ICSMC.1997.637339

  20. Song X, Zhang Y, Gong D et al (2021) Feature selection using bare-bones particle swarm optimization with mutual information. Pattern Recogn 112:107804. https://doi.org/10.1016/j.patcog.2020.107804

  21. Hu Y, Zhang Y, Gong D (2021) Multi objective particle swarm optimization for feature selection with fuzzy cost. IEEE Trans Cybern 51(2):874–888. https://doi.org/10.1109/TCYB.2020.3015756

  22. Tran B, Xue B, Zhang M (2019) Variable-length particle swarm optimization for feature selection on high-dimensional classification. IEEE Trans Evol Comput 23(3):473–487. https://doi.org/10.1109/TEVC.2018.2869405

  23. Zhang Y, Li HG, Wang Q et al (2019) A filter-based bare-bone particle swarm optimization algorithm for unsupervised feature selection. Appl Intell 49(8):2889–2898. https://doi.org/10.1007/s10489-019-01420-9

  24. Zhang Y, Zhang J, Guo Y et al (2016) Fuzzy cost-based feature selection using interval multi-objective particle swarm optimization algorithm. J Intell Fuzzy Syst 31(6):2807–2812. https://doi.org/10.3233/JIFS-169162

  25. Saqlain SM, Sher M, Shah FA et al (2019) Fisher score and Matthews correlation coefficient-based feature subset selection for heart disease diagnosis using support vector machines. Knowl Inf Syst 58(1):139–167. https://doi.org/10.1007/s10115-018-1185-y

  26. Che J, Yang Y, Li L et al (2017) Maximum relevance minimum common redundancy feature selection for nonlinear data. Inf Sci 409:68–86. https://doi.org/10.1016/j.ins.2017.05.013

  27. Qi G, Hu J, Wang Z (2020) Modeling of a Hamiltonian conservative chaotic system and its mechanism routes from periodic to quasiperiodic, chaos and strong chaos. Appl Math Model 78:350–365. https://doi.org/10.1016/j.apm.2019.08.023

  28. Akbarpour A, Zeynali MJ, Tahroudi MN (2020) Locating optimal position of pumping wells in aquifer using meta-heuristic algorithms and finite element method. Water Resour Manag 34(1):21–34. https://doi.org/10.1007/s11269-019-02386-6

  29. Deng Y (2016) Deng entropy. Chaos Solitons Fractals 91:549–553. https://doi.org/10.1016/j.chaos.2016.07.014

  30. Gabrié M, Manoel A, Luneau C et al (2019) Entropy and mutual information in models of deep neural networks. J Stat Mech 2019(12):124014. https://doi.org/10.1088/1742-5468/ab3430

  31. Cakir F, He K, Bargal SA et al (2019) Hashing with mutual information. IEEE Trans Pattern Anal Mach Intell 41(10):2424–2437. https://doi.org/10.1109/TPAMI.2019.2914897

  32. Yin L, Xingfei M, Mengxi Y et al (2015) Improved feature selection based on normalized mutual information. Int Symp Distrib Comput Appl Bus Eng Sci (DCABES). https://doi.org/10.1109/SAINT.2010.50

  33. Che J, Yang Y, Li L et al (2017) Maximum relevance minimum common redundancy feature selection for nonlinear data. Inf Sci 409:68–86. https://doi.org/10.1016/j.ins.2017.05.013

  34. Kaleeswaran V, Dhamodharavadhani S, Rathipriya R (2021) Multi-crop selection model using binary particle swarm optimization. Innovative data communication technologies and application. Springer, Singapore, pp 57–68. https://doi.org/10.1007/978-981-15-9651-3_5

  35. Chuang LY, Yang CH, Li JC (2011) Chaotic maps based on binary particle swarm optimization for feature selection. Appl Soft Comput 11(1):239–248. https://doi.org/10.1016/j.asoc.2009.11.014

  36. Chen K, Zhou F, Liu A (2018) Chaotic dynamic weight particle swarm optimization for numerical function optimization. Knowl-Based Syst 139:23–40. https://doi.org/10.1016/j.knosys.2017.10.011

  37. UCI database (2022). http://archive.ics.uci.edu/ml/datasets.php. Accessed 7 May 2020

  38. Rezaee Jordehi A, Jasni J (2013) Parameter selection in particle swarm optimisation: a survey. J Exp Theoret Artif Intell 25(4):527–542. https://doi.org/10.1080/0952813X.2013.782348

  39. Liu J, Mei Y, Li X (2015) An analysis of the inertia weight parameter for binary particle swarm optimization. IEEE Trans Evol Comput 20(5):666–681. https://doi.org/10.1109/TEVC.2015.2503422

  40. Zhu H, Hu Y, Zhu W (2019) A dynamic adaptive particle swarm optimization and genetic algorithm for different constrained engineering design optimization problems. Adv Mech Eng 11(3):1687814018824930. https://doi.org/10.1177/1687814018824930

  41. Bennasar M, Hicks Y, Setchi R (2015) Feature selection using joint mutual information maximisation. Expert Syst Appl 42(22):8520–8532. https://doi.org/10.1016/j.eswa.2015.07.007

  42. Unler A, Murat A, Chinnam RB (2011) mr2PSO: a maximum relevance minimum redundancy feature selection method based on swarm intelligence for support vector machine classification. Inf Sci 181(20):4625–4641. https://doi.org/10.1016/j.ins.2010.05.037

  43. Liu W, Wang Z, Zeng N et al (2021) A novel randomised particle swarm optimizer. Int J Mach Learn Cybern 12(2):529–540. https://doi.org/10.1007/s13042-020-01186-4

  44. Abu Arqub O (2017) Adaptation of reproducing kernel algorithm for solving fuzzy Fredholm-Volterra integrodifferential equations. Neural Comput Appl 28(7):1591–1610. https://doi.org/10.1007/s00521-015-2110-x

  45. Abu Arqub O, Singh J, Maayah B et al (2021) Reproducing kernel approach for numerical solutions of fuzzy fractional initial value problems under the Mittag-Leffler kernel differential operator. Math Methods Appl Sci. https://doi.org/10.1002/mma.7305

  46. Abu Arqub O, Singh J, Alhodaly M (2021) Adaptation of kernel functions based approach with Atangana-Baleanu-Caputo distributed order derivative for solutions of fuzzy fractional Volterra and Fredholm integrodifferential equations. Math Methods Appl Sci. https://doi.org/10.1002/mma.7228

  47. Wang X, Zhao Y, Pourpanah F (2020) Recent advances in deep learning. Int J Mach Learn Cybern 11(4):747–750. https://doi.org/10.1007/s13042-020-01096-5

  48. Zhang K, Zhan J, Wang X (2020) TOPSIS-WAA method based on a covering-based fuzzy rough set: an application to rating problem. Inf Sci 539:397–421. https://doi.org/10.1016/j.ins.2020.06.009

  49. Ni P, Zhao S, Wang X et al (2020) Incremental feature selection based on fuzzy rough sets. Inf Sci 536:185–204. https://doi.org/10.1016/j.ins.2020.04.038

  50. Ehteram M, Salih SQ, Yaseen ZM (2020) Efficiency evaluation of reverse osmosis desalination plant using hybridized multilayer perceptron with particle swarm optimization. Environ Sci Pollut Res 27(13):15278–15291. https://doi.org/10.1007/s11356-020-08023-9

  51. Feng Z, Niu W, Zhang R et al (2019) Operation rule derivation of hydropower reservoir by k-means clustering method and extreme learning machine based on particle swarm optimization. J Hydrol 576:229–238. https://doi.org/10.1016/j.jhydrol.2019.06.045

  52. Taormina R, Chau KW (2015) Data-driven input variable selection for rainfall-runoff modeling using binary-coded particle swarm optimization and extreme learning machines. J Hydrol 529:1617–1632. https://doi.org/10.1016/j.jhydrol.2015.08.022

  53. Cao W, Wang X, Ming Z et al (2018) A review on neural networks with random weights. Neurocomputing 275:278–287. https://doi.org/10.1016/j.neucom.2017.08.040

  54. Taormina R, Chau KW (2015) ANN-based interval forecasting of streamflow discharges using the LUBE method and MOFIPS. Eng Appl Artif Intell 45:429–440. https://doi.org/10.1016/j.engappai.2015.07.019

  55. Ali Ghorbani M, Kazempour R, Chau KW et al (2018) Forecasting pan evaporation with an integrated artificial neural network quantum-behaved particle swarm optimization model: a case study in Talesh, Northern Iran. Eng Appl Comput Fluid Mech 12(1):724–737. https://doi.org/10.1080/19942060.2018.1517052

  56. Cheng C, Niu W, Feng Z et al (2015) Daily reservoir runoff forecasting method using artificial neural network based on quantum-behaved particle swarm optimization. Water 7(8):4232–4246. https://doi.org/10.3390/w7084232

  57. Khan GA, Hu J, Li T et al (2022) Multi-view data clustering via non-negative matrix factorization with manifold regularization. Int J Mach Learn Cybern 13(3):677–689. https://doi.org/10.1007/s13042-021-01307-7

  58. Abualigah L, Alsalibi B, Shehab M et al (2021) A parallel hybrid krill herd algorithm for feature selection. Int J Mach Learn Cybern 12(3):783–806. https://doi.org/10.1007/s13042-020-01202-7

Acknowledgements

The authors are grateful to the editor and reviewers for their valuable comments. This work is financially supported by the National Natural Science Foundation of China (61573266) and the Natural Science Basic Research Program of Shaanxi (Program No. 2021JM-133).

Author information

Corresponding authors

Correspondence to Liqin Sun or Youlong Yang.

Ethics declarations

Conflict of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

About this article

Cite this article

Sun, L., Yang, Y., Liu, Y. et al. Feature selection based on a hybrid simplified particle swarm optimization algorithm with maximum separation and minimum redundancy. Int. J. Mach. Learn. & Cyber. 14, 789–816 (2023). https://doi.org/10.1007/s13042-022-01663-y

  • DOI: https://doi.org/10.1007/s13042-022-01663-y
