Abstract
Feature selection is widely used in data mining because it can improve algorithm performance and reveal more information about a dataset. Although the firefly algorithm is an effective heuristic, it leaves considerable room for improvement on the feature selection problem. This paper proposes an improved firefly algorithm for feature selection that uses a ReliefF-based initialization method and a weighted voting mechanism. First, a feature grouping initialization method that combines the results of the ReliefF algorithm with cosine similarity replaces random initialization. Second, the movement direction of each firefly is modified so that it moves toward the optimal solution. Finally, inspired by ensemble algorithms, a weighted voter is proposed to build recommended positions for the fireflies; it is integrated with an elite crossover operator and a mutation operator to improve population diversity. A new population is then selected from the mixed swarm to replace the original population in the next stage. To verify the effectiveness of the proposed algorithm, simulation experiments are conducted on 18 datasets against 9 comparison algorithms from state-of-the-art related work (e.g., Black Hole Algorithm, Grey Wolf Optimizer and Pigeon Inspired Optimizer). The experimental results indicate that the proposed algorithm outperforms the comparison algorithms on the feature selection problem.
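The abstract does not spell out how the weighted voter combines the fireflies, so the following is only a minimal sketch of the general idea of fitness-weighted voting over binary feature masks, not the authors' exact procedure. The function name `weighted_vote`, the 0.5 threshold, and the convention that a higher fitness value is better are all assumptions made for illustration.

```python
from typing import List, Sequence


def weighted_vote(population: Sequence[Sequence[int]],
                  fitness: Sequence[float],
                  threshold: float = 0.5) -> List[int]:
    """Build one recommended binary position by fitness-weighted voting.

    population: binary masks (1 = feature selected), one per firefly.
    fitness: non-negative scores, higher is better (assumed convention).
    threshold: hypothetical cutoff on the weighted share of votes.
    """
    if len(population) != len(fitness):
        raise ValueError("population and fitness must have the same length")
    total = sum(fitness)
    if total <= 0:
        raise ValueError("fitness values must sum to a positive number")

    n_features = len(population[0])
    recommended = []
    for j in range(n_features):
        # Share of the total fitness held by fireflies that select feature j.
        share = sum(w for mask, w in zip(population, fitness) if mask[j]) / total
        recommended.append(1 if share > threshold else 0)
    return recommended


if __name__ == "__main__":
    # Three hypothetical fireflies over four features, with made-up fitness.
    swarm = [
        [1, 0, 1, 0],
        [1, 1, 0, 0],
        [0, 1, 1, 0],
    ]
    scores = [0.9, 0.3, 0.3]
    print(weighted_vote(swarm, scores))
```

In this toy swarm, the second feature is chosen by two of the three fireflies, yet it is left out of the recommended position because those two fireflies together hold only 40% of the total fitness. That is the practical difference between weighted voting and a plain majority vote.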
Data availability
The datasets used in the current study are available from Kaggle (https://www.kaggle.com) and the UCI Machine Learning Repository (http://archive.ics.uci.edu/ml/index.php).
Funding
This work is supported by the Key Project of Ningxia Natural Science Foundation (2022AAC02043), the Major Scientific Research Project of Northern University for Nationalities (ZDZX201901), the Natural Science Foundation of Ningxia Hui Autonomous Region (2021AAC03185), the Research Startup Foundation of North Minzu University (2020KYQD23), the National Natural Science Foundation of China (61561001) and the First-class Discipline Construction Fund Project of Ningxia Higher Education (NXYLXK2017B09).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Yong, X., Gao, Yl. Improved firefly algorithm for feature selection with the ReliefF-based initialization and the weighted voting mechanism. Neural Comput & Applic 35, 275–301 (2023). https://doi.org/10.1007/s00521-022-07755-8
DOI: https://doi.org/10.1007/s00521-022-07755-8