Abstract
Feature selection is an important data-processing technique in machine learning and data mining. Its goal is to select the feature subset that achieves the highest classification accuracy with the fewest features. When particle swarm optimization (PSO) is used to search for the optimal subset in a high-dimensional data set, it tends to fall into local optima and is computationally expensive, which lowers classification accuracy. To address these problems, this paper proposes a hybrid simplified PSO-based feature selection algorithm with an elite strategy (HECSPSO). It makes the following improvements: (1) In the population initialization stage, the conditional separation probability of features (\(S\_\mathrm{probability}\)) is redefined according to the separation performance of the features, and a new population initialization strategy is built on it. (2) To further improve convergence speed, an addition and deletion criterion of maximum separation and minimum redundancy, called the elite strategy, is proposed based on the separation and redundancy of features. (3) To reduce the complexity of the model, a simplified particle swarm optimization algorithm is proposed in which the evolution process is controlled only by particle position; this simplifies the iterative process and avoids the slow convergence and low precision caused by particle velocity. (4) To keep the algorithm from falling into local optima, a chaotic mechanism is used as a local search operator near known solutions. For a comprehensive evaluation, the proposed method is compared with other PSO-based algorithms on 16 data sets from the UCI (University of California Irvine) Machine Learning Repository in three respects: classification accuracy, size of the selected feature subset, and number of iterations to convergence. The results show that the proposed algorithm obtains feature subsets with better performance and is highly competitive for feature selection.
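Improvements (3) and (4) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the update equations, so the position-only rule, the use of the logistic map as the chaotic map, and every name here (`position_only_update`, `to_mask`, `chaotic_local_search`, `toy_fitness`, and the coefficients `w`, `c1`, `c2`) are assumptions chosen only to show the general idea.

```python
import random


def position_only_update(x, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=random):
    """Move a particle using positions only (no velocity term).

    Assumed rule: x <- w*x + c1*r1*(pbest - x) + c2*r2*(gbest - x),
    clamped to [0, 1]. The coefficients are illustrative defaults.
    """
    new_x = []
    for xi, pi, gi in zip(x, pbest, gbest):
        value = (w * xi
                 + c1 * rng.random() * (pi - xi)
                 + c2 * rng.random() * (gi - xi))
        new_x.append(min(1.0, max(0.0, value)))
    return new_x


def to_mask(x, threshold=0.5):
    """Turn a continuous position into a 0/1 feature-selection mask."""
    return [1 if xi > threshold else 0 for xi in x]


def logistic_map(z, r=4.0):
    """One step of the logistic map, a common chaotic sequence generator."""
    return r * z * (1.0 - z)


def chaotic_local_search(mask, fitness, steps=20, z0=0.37):
    """Flip one chaos-selected bit per step near a known solution.

    A flipped candidate replaces the current best only if it scores higher.
    """
    best = list(mask)
    best_fit = fitness(best)
    z = z0
    for _ in range(steps):
        z = logistic_map(z)
        idx = min(int(z * len(best)), len(best) - 1)
        candidate = list(best)
        candidate[idx] = 1 - candidate[idx]
        cand_fit = fitness(candidate)
        if cand_fit > best_fit:
            best, best_fit = candidate, cand_fit
    return best, best_fit


def toy_fitness(mask):
    """Hypothetical score: reward features 0-2, penalize subset size."""
    useful = sum(mask[:3])
    return useful - 0.1 * sum(mask)
```

In a real wrapper method the `fitness` argument would be a classifier's accuracy on the selected features; `toy_fitness` only stands in for it so the sketch runs without a data set.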











Data availability
The data that support the findings of this study are openly available in the UCI Machine Learning Repository; see reference [34].
References
Bolón-Canedo V, Alonso-Betanzos A (2019) Ensembles for feature selection: a review and future trends. Inf Fusion 52:1–12. https://doi.org/10.1016/j.inffus.2018.11.008
Thabtah F, Kamalov F, Hammoud S et al (2020) Least Loss: a simplified filter method for feature selection. Inf Sci 534:1–15. https://doi.org/10.1016/j.ins.2020.05.017
Mafarja M, Mirjalili S (2018) Whale optimization approaches for wrapper feature selection. Appl Soft Comput 62:441–453. https://doi.org/10.1016/j.asoc.2017.11.006
Venkatesh B, Anuradha J (2019) A hybrid feature selection approach for handling a high-dimensional data. Innovations in computer science and engineering. Springer, Singapore, pp 365–373. https://doi.org/10.1007/978-981-13-7082-3_42
Zhang X, Shi Z, Liu X et al (2018) A hybrid feature selection algorithm for classification unbalanced data processing. IEEE Int Conf Smart Internet of Things (SmartIoT). https://doi.org/10.1109/SmartIoT.2018.00055
Song XF, Zhang Y, Gong DW et al (2021) A fast hybrid feature selection based on correlation-guided clustering and particle swarm optimization for high-dimensional data. IEEE Trans Cybern. https://doi.org/10.1109/TCYB.2021.3061152
Mirjalili S (2019) Genetic algorithm. Evolutionary algorithms and neural networks. Springer, Cham, pp 43–55. https://doi.org/10.1007/978-3-319-93025-1_4
Arqub OA, Abo-Hammour Z (2014) Numerical solution of systems of second-order boundary value problems using continuous genetic algorithm. Inf Sci 279:396–415. https://doi.org/10.1016/j.ins.2014.03.128
Bansal JC (2019) Particle swarm optimization. Evolutionary and swarm intelligence algorithms. Springer, Cham, pp 11–23. https://doi.org/10.1007/978-3-319-91341-4_2
Pathan S, Panwar D (2020) A smart channel estimation approach for LTE systems using PSO algorithm. Ann Optim Theory Pract 3(3):1–13
Civicioglu P, Besdok E (2019) Bernstain-search differential evolution algorithm for numerical function optimization. Expert Syst Appl 138:112831. https://doi.org/10.1016/j.eswa.2019.112831
Uthayakumar J, Metawa N, Shankar K et al (2020) Financial crisis prediction model using ant colony optimization. Int J Inf Manag 50:538–556. https://doi.org/10.1016/j.ijinfomgt.2018.12.001
Zhou J, Yao X, Chan FTS et al (2019) An individual dependent multi-colony artificial bee colony algorithm. Inf Sci 485:114–140. https://doi.org/10.1016/j.ins.2019.02.014
Garg H (2019) A hybrid GSA-GA algorithm for constrained optimization problems. Inf Sci 478:499–523. https://doi.org/10.1016/j.ins.2018.11.041
Garg H (2015) A hybrid GA-GSA algorithm for optimizing the performance of an industrial system by utilizing uncertain data. IGI Global, London, pp 620–654
Garg H (2016) A hybrid PSO-GA algorithm for constrained optimization problems. Appl Math Comput 274:292–305. https://doi.org/10.1016/j.amc.2015.11.001
Solorio-Fernández S, Carrasco-Ochoa JA, Martínez-Trinidad JF (2016) Feature selection based on hybridization of genetic algorithm and particle swarm optimization. Neurocomputing 214:866–880. https://doi.org/10.1016/j.neucom.2016.07.026
Du SY, Liu ZG (2020) Hybridizing particle swarm optimization with JADE for continuous optimization. Multimed Tools Appl 79(7):4619–4636. https://doi.org/10.1007/s11042-019-08142-7
Kennedy J, Eberhart RC (1997) A discrete binary version of the particle swarm algorithm. IEEE Int Conf Syst Man Cybern 5:4104–4108. https://doi.org/10.1109/ICSMC.1997.637339
Song X, Zhang Y, Gong D et al (2021) Feature selection using bare-bones particle swarm optimization with mutual information. Pattern Recogn 112:107804. https://doi.org/10.1016/j.patcog.2020.107804
Hu Y, Zhang Y, Gong D (2021) Multiobjective particle swarm optimization for feature selection with fuzzy cost. IEEE Trans Cybern 51(2):874–888. https://doi.org/10.1109/TCYB.2020.3015756
Tran B, Xue B, Zhang M (2019) Variable-length particle swarm optimization for feature selection on high-dimensional classification. IEEE Trans Evol Comput 23(3):473–487. https://doi.org/10.1109/TEVC.2018.2869405
Zhang Y, Li HG, Wang Q et al (2019) A filter-based bare-bone particle swarm optimization algorithm for unsupervised feature selection. Appl Intell 49(8):2889–2898. https://doi.org/10.1007/s10489-019-01420-9
Zhang Y, Zhang J, Guo Y et al (2016) Fuzzy cost-based feature selection using interval multi-objective particle swarm optimization algorithm. J Intell Fuzzy Syst 31(6):2807–2812. https://doi.org/10.3233/JIFS-169162
Saqlain SM, Sher M, Shah FA et al (2019) Fisher score and Matthews correlation coefficient-based feature subset selection for heart disease diagnosis using support vector machines. Knowl Inf Syst 58(1):139–167. https://doi.org/10.1007/s10115-018-1185-y
Che J, Yang Y, Li L et al (2017) Maximum relevance minimum common redundancy feature selection for nonlinear data. Inf Sci 409:68–86. https://doi.org/10.1016/j.ins.2017.05.013
Qi G, Hu J, Wang Z (2020) Modeling of a Hamiltonian conservative chaotic system and its mechanism routes from periodic to quasiperiodic, chaos and strong chaos. Appl Math Model 78:350–365. https://doi.org/10.1016/j.apm.2019.08.023
Akbarpour A, Zeynali MJ, Tahroudi MN (2020) Locating optimal position of pumping wells in aquifer using meta-heuristic algorithms and finite element method. Water Resour Manag 34(1):21–34. https://doi.org/10.1007/s11269-019-02386-6
Deng Y (2016) Deng entropy. Chaos Solitons Fractals 91:549–553. https://doi.org/10.1016/j.chaos.2016.07.014
Gabrié M, Manoel A, Luneau C et al (2019) Entropy and mutual information in models of deep neural networks. J Stat Mech 2019(12):124014. https://doi.org/10.1088/1742-5468/ab3430
Cakir F, He K, Bargal SA et al (2019) Hashing with mutual information. IEEE Trans Pattern Anal Mach Intell 41(10):2424–2437. https://doi.org/10.1109/TPAMI.2019.2914897
Yin L, Xingfei M, Mengxi Y et al (2015) Improved feature selection based on normalized mutual information. Int Symp Distrib Comput Appl Bus Eng Sci (DCABES). https://doi.org/10.1109/SAINT.2010.50
Che J, Yang Y, Li L et al (2017) Maximum relevance minimum common redundancy feature selection for nonlinear data. Inf Sci 409:68–86. https://doi.org/10.1016/j.ins.2017.05.013
Kaleeswaran V, Dhamodharavadhani S, Rathipriya R (2021) Multi-crop selection model using binary particle swarm optimization. Innovative data communication technologies and application. Springer, Singapore, pp 57–68. https://doi.org/10.1007/978-981-15-9651-3_5
Chuang LY, Yang CH, Li JC (2011) Chaotic maps based on binary particle swarm optimization for feature selection. Appl Soft Comput 11(1):239–248. https://doi.org/10.1016/j.asoc.2009.11.014
Chen K, Zhou F, Liu A (2018) Chaotic dynamic weight particle swarm optimization for numerical function optimization. Knowl-Based Syst 139:23–40. https://doi.org/10.1016/j.knosys.2017.10.011
UCI database (2022). http://archive.ics.uci.edu/ml/datasets.php. Accessed 7 May 2020
Rezaee Jordehi A, Jasni J (2013) Parameter selection in particle swarm optimisation: a survey. J Exp Theoret Artif Intell 25(4):527–542. https://doi.org/10.1080/0952813X.2013.782348
Liu J, Mei Y, Li X (2015) An analysis of the inertia weight parameter for binary particle swarm optimization. IEEE Trans Evol Comput 20(5):666–681. https://doi.org/10.1109/TEVC.2015.2503422
Zhu H, Hu Y, Zhu W (2019) A dynamic adaptive particle swarm optimization and genetic algorithm for different constrained engineering design optimization problems. Adv Mech Eng 11(3):1687814018824930. https://doi.org/10.1177/1687814018824930
Bennasar M, Hicks Y, Setchi R (2015) Feature selection using joint mutual information maximisation. Expert Syst Appl 42(22):8520–8532. https://doi.org/10.1016/j.eswa.2015.07.007
Unler A, Murat A, Chinnam RB (2011) mr2PSO: a maximum relevance minimum redundancy feature selection method based on swarm intelligence for support vector machine classification. Inf Sci 181(20):4625–4641. https://doi.org/10.1016/j.ins.2010.05.037
Liu W, Wang Z, Zeng N et al (2021) A novel randomised particle swarm optimizer. Int J Mach Learn Cybern 12(2):529–540. https://doi.org/10.1007/s13042-020-01186-4
Abu Arqub O (2017) Adaptation of reproducing kernel algorithm for solving fuzzy Fredholm-Volterra integrodifferential equations. Neural Comput Appl 28(7):1591–1610. https://doi.org/10.1007/s00521-015-2110-x
Abu Arqub O, Singh J, Maayah B et al (2021) Reproducing kernel approach for numerical solutions of fuzzy fractional initial value problems under the Mittag-Leffler kernel differential operator. Math Methods Appl Sci. https://doi.org/10.1002/mma.7305
Abu Arqub O, Singh J, Alhodaly M (2021) Adaptation of kernel functions based approach with Atangana-Baleanu-Caputo distributed order derivative for solutions of fuzzy fractional Volterra and Fredholm integrodifferential equations. Math Methods Appl Sci. https://doi.org/10.1002/mma.7228
Wang X, Zhao Y, Pourpanah F (2020) Recent advances in deep learning. Int J Mach Learn Cybern 11(4):747–750. https://doi.org/10.1007/s13042-020-01096-5
Zhang K, Zhan J, Wang X (2020) TOPSIS-WAA method based on a covering-based fuzzy rough set: an application to rating problem. Inf Sci 539:397–421. https://doi.org/10.1016/j.ins.2020.06.009
Ni P, Zhao S, Wang X et al (2020) Incremental feature selection based on fuzzy rough sets. Inf Sci 536:185–204. https://doi.org/10.1016/j.ins.2020.04.038
Ehteram M, Salih SQ, Yaseen ZM (2020) Efficiency evaluation of reverse osmosis desalination plant using hybridized multilayer perceptron with particle swarm optimization. Environ Sci Pollut Res 27(13):15278–15291. https://doi.org/10.1007/s11356-020-08023-9
Feng Z, Niu W, Zhang R et al (2019) Operation rule derivation of hydropower reservoir by k-means clustering method and extreme learning machine based on particle swarm optimization. J Hydrol 576:229–238. https://doi.org/10.1016/j.jhydrol.2019.06.045
Taormina R, Chau KW (2015) Data-driven input variable selection for rainfall-runoff modeling using binary-coded particle swarm optimization and extreme learning machines. J Hydrol 529:1617–1632. https://doi.org/10.1016/j.jhydrol.2015.08.022
Cao W, Wang X, Ming Z et al (2018) A review on neural networks with random weights. Neurocomputing 275:278–287. https://doi.org/10.1016/j.neucom.2017.08.040
Taormina R, Chau KW (2015) ANN-based interval forecasting of streamflow discharges using the LUBE method and MOFIPS. Eng Appl Artif Intell 45:429–440. https://doi.org/10.1016/j.engappai.2015.07.019
Ali Ghorbani M, Kazempour R, Chau KW et al (2018) Forecasting pan evaporation with an integrated artificial neural network quantum-behaved particle swarm optimization model: a case study in Talesh, Northern Iran. Eng Appl Comput Fluid Mech 12(1):724–737. https://doi.org/10.1080/19942060.2018.1517052
Cheng C, Niu W, Feng Z et al (2015) Daily reservoir runoff forecasting method using artificial neural network based on quantum-behaved particle swarm optimization. Water 7(8):4232–4246. https://doi.org/10.3390/w7084232
Khan GA, Hu J, Li T et al (2022) Multi-view data clustering via non-negative matrix factorization with manifold regularization. Int J Mach Learn Cybern 13(3):677–689. https://doi.org/10.1007/s13042-021-01307-7
Abualigah L, Alsalibi B, Shehab M et al (2021) A parallel hybrid krill herd algorithm for feature selection. Int J Mach Learn Cybern 12(3):783–806. https://doi.org/10.1007/s13042-020-01202-7
Acknowledgements
The authors are grateful to the editor and reviewers for their valuable comments. This work was financially supported by the National Natural Science Foundation of China (61573266) and the Natural Science Basic Research Program of Shaanxi (Program No. 2021JM-133).
Ethics declarations
Conflict of Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sun, L., Yang, Y., Liu, Y. et al. Feature selection based on a hybrid simplified particle swarm optimization algorithm with maximum separation and minimum redundancy. Int. J. Mach. Learn. & Cyber. 14, 789–816 (2023). https://doi.org/10.1007/s13042-022-01663-y
DOI: https://doi.org/10.1007/s13042-022-01663-y