Abstract
Selecting a subset of candidate features is one of the important steps in the data mining process. The ultimate goal of feature selection is to select an optimal number of high-quality features that maximizes the performance of the learning algorithm. However, this problem becomes challenging as the number of features in a dataset grows. Hence, advanced optimization techniques are now widely used to search for optimal feature combinations. The Whale Optimization Algorithm (WOA) is a recent metaheuristic that has been successfully applied to a variety of optimization problems. In this work, we propose a new variant of WOA (SBWOA) based on a spatial bounding strategy, designed to identify the most promising features in a high-dimensional feature space. A simplified version of SBWOA is also introduced in an attempt to keep the computational complexity low. The effectiveness of the proposed approach was validated on 16 high-dimensional datasets gathered from Arizona State University, and the results were compared with eight other state-of-the-art feature selection methods. Among the competitors, SBWOA achieved the highest accuracy on most datasets, such as TOX_171, Colon, and Prostate_GE. The results obtained demonstrate the superiority of the proposed approaches over the comparison methods.
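To make the wrapper-style setting concrete, the following is a minimal sketch of a *generic* binary WOA feature selector, not the paper's SBWOA or its spatial bounding strategy. Continuous whale positions are mapped to binary feature masks by thresholding a sigmoid, and a toy fitness function (reward for hitting a known informative subset, minus a size penalty) stands in for classifier accuracy. All names, parameters, and the fitness function are illustrative assumptions.

```python
import numpy as np

def binary_woa(fitness, n_features, n_whales=20, n_iter=60, b=1.0, seed=42):
    """Sketch of binary WOA: continuous positions binarized via a
    sigmoid transfer function (sigmoid(x) > 0.5, i.e. x > 0)."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-3.0, 3.0, (n_whales, n_features))

    def mask_of(x):
        return x > 0.0  # equivalent to sigmoid(x) > 0.5

    scores = np.array([fitness(mask_of(x)) for x in pos])
    best_i = int(np.argmax(scores))
    best_pos, best_score = pos[best_i].copy(), scores[best_i]

    for t in range(n_iter):
        a = 2.0 * (1 - t / n_iter)  # control parameter decreases 2 -> 0
        for i in range(n_whales):
            r1, r2 = rng.random(), rng.random()
            A, C = 2 * a * r1 - a, 2 * r2
            if rng.random() < 0.5:
                if abs(A) < 1:      # exploitation: encircle the best whale
                    D = np.abs(C * best_pos - pos[i])
                    pos[i] = best_pos - A * D
                else:               # exploration: move toward a random whale
                    j = rng.integers(n_whales)
                    D = np.abs(C * pos[j] - pos[i])
                    pos[i] = pos[j] - A * D
            else:                   # spiral (bubble-net) update around the best
                l = rng.uniform(-1.0, 1.0)
                D = np.abs(best_pos - pos[i])
                pos[i] = D * np.exp(b * l) * np.cos(2 * np.pi * l) + best_pos
            s = fitness(mask_of(pos[i]))
            if s > best_score:
                best_score, best_pos = s, pos[i].copy()

    return mask_of(best_pos), best_score

# Toy wrapper fitness: reward selecting the (hypothetical) informative
# features {0, 3, 7} out of 15, penalize subset size.
informative = np.array([0, 3, 7])

def toy_fitness(mask):
    return float(mask[informative].sum()) - 0.2 * float(mask.sum())

mask, score = binary_woa(toy_fitness, n_features=15)
```

In a real wrapper, `toy_fitness` would train a classifier (e.g. k-NN) on the selected columns and return its validation accuracy minus a feature-count penalty, which is the usual fitness design in WOA-based feature selection.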
Data availability
Available at http://featureselection.asu.edu/datasets.php.
Funding
No funding was received for conducting this study.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Too, J., Mafarja, M. & Mirjalili, S. Spatial bound whale optimization algorithm: an efficient high-dimensional feature selection approach. Neural Comput & Applic 33, 16229–16250 (2021). https://doi.org/10.1007/s00521-021-06224-y