Abstract
Feature selection algorithms based on evolutionary computation continue to emerge, and most of them achieve outstanding results. However, they face two drawbacks on high-dimensional datasets: first, it is difficult to reduce the number of features effectively, and second, the “curse of dimensionality”. To alleviate these problems, we take initial population generation as the entry point and propose a variant initial population generator, which improves diversity and initializes populations randomly throughout the solution space. During the experiments, however, we found that the improved diversity causes the algorithm to converge too quickly and thus leads to premature convergence. We therefore introduce multi-population techniques to balance diversity against convergence speed, which yields the MPF-FS framework. To demonstrate the effectiveness of this framework, two feature selection algorithms, a multi-population multi-objective artificial bee colony algorithm and a multi-population non-dominated sorting genetic algorithm II, are implemented on top of it. Nine well-known public datasets were used in this study, and the results reveal that on high-dimensional datasets the two proposed multi-population methods can remove more features without reducing (and in some cases improving) classification accuracy, outperforming the corresponding single-population algorithms. Compared with state-of-the-art methods, our approach still shows promising results.
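The generator itself is detailed in the full paper; as a rough illustration only, the sketch below shows one common way such a diversity-oriented initializer can work for binary feature selection, under the assumption that each individual draws its subset size uniformly over all possible sizes (rather than flipping each bit with probability 0.5, which concentrates individuals around half the features). The function name and signature are hypothetical, not taken from the paper.

```python
import random

def variant_init_population(pop_size, n_features, rng=None):
    """Hypothetical sketch of a diversity-oriented initializer for
    binary feature selection.

    Each individual first draws its subset size k uniformly from
    1..n_features, then places the k selected features at random
    positions, so the population spreads over subsets of every size
    instead of clustering around n_features / 2 selected features.
    """
    rng = rng or random.Random()
    population = []
    for _ in range(pop_size):
        # Subset size is uniform over all sizes, not binomial.
        k = rng.randint(1, n_features)
        selected = rng.sample(range(n_features), k)
        individual = [0] * n_features
        for idx in selected:
            individual[idx] = 1
        population.append(individual)
    return population
```

With a standard per-bit Bernoulli(0.5) initializer, almost every individual selects close to half the features; the size-uniform variant sketched here instead covers very small and very large subsets from the start, which is one plausible reading of "initializes populations randomly throughout the solution space".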
Data Availability
The data that support the findings of this study are openly available in the UCI Machine Learning Repository at http://archive.ics.uci.edu/ml, reference number [38], and in FEATURE SELECTION DATASETS at https://jundongl.github.io/scikit-feature/datasets.html.
References
Li J, Cheng K, Wang S, Morstatter F, Trevino RP, Tang J, Liu H (2017) Feature selection: A data perspective. ACM Computing Surveys (CSUR). 50(6):1–45
Zhu Z, Ong Y-S, Dash M (2007) Wrapper–filter feature selection algorithm using a memetic framework. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 37(1):70–76
Gunantara N (2018) A review of multi-objective optimization: Methods and its applications. Cogent Engineering 5(1):1502242
Battiti R (1994) Using mutual information for selecting features in supervised neural net learning. IEEE Transactions on Neural Networks 5(4):537–550
Robnik-Šikonja M, Kononenko I (2003) Theoretical and empirical analysis of relieff and rrelieff. Machine Learning 53(1):23–69
Song Q, Jiang H, Liu J (2017) Feature selection based on fda and f-score for multi-class classification. Expert Systems with Applications 81:22–27
Caruana R, Freitag D (1994) Greedy attribute selection. In: Machine Learning Proceedings 1994, Elsevier, pp 28–36
Gutlein M, Frank E, Hall M, Karwath A (2009) Large-scale attribute selection using wrappers. In: 2009 IEEE symposium on computational intelligence and data mining, IEEE, pp 332–339
Zhang Y, Gong D-w, Gao X-z, Tian T, Sun X-y (2020) Binary differential evolution with self-learning for multi-objective feature selection. Information Sciences 507:67–85
Mirzaei A, Mohsenzadeh Y, Sheikhzadeh H (2017) Variational relevant sample-feature machine: a fully bayesian approach for embedded feature selection. Neurocomputing 241:181–190
Rostami M, Berahmand K, Nasiri E, Forouzandeh S (2021) Review of swarm intelligence-based feature selection methods. Engineering Applications of Artificial Intelligence 100:104210
Hancer E, Xue B, Zhang M, Karaboga D, Akay B (2018) Pareto front feature selection based on artificial bee colony optimization. Information Sciences 422:462–479
Deb K, Pratap A, Agarwal S, Meyarivan T (2002) A fast and elitist multiobjective genetic algorithm: Nsga-ii. IEEE Transactions on Evolutionary Computation 6(2):182–197
Xue Y, Tang T, Pang A, Liu X (2020) Self-adaptive parameter and strategy based particle swarm optimization for large-scale feature selection problems with multiple classifiers. Applied Soft Computing 88:106031
Song X-F, Zhang Y, Gong D-W, Gao X-Z (2021) A fast hybrid feature selection based on correlation-guided clustering and particle swarm optimization for high-dimensional data. IEEE Transactions on Cybernetics
Chen K, Xue B, Zhang M, Zhou F (2020) An evolutionary multitasking-based feature selection method for high-dimensional classification. IEEE Transactions on Cybernetics
Song X-F, Zhang Y, Guo Y-N, Sun X-Y, Wang Y-L (2020) Variable-size cooperative coevolutionary particle swarm optimization for feature selection on high-dimensional data. IEEE Transactions on Evolutionary Computation 24(5):882–895
Shunmugapriya P, Kanmani S (2017) A hybrid algorithm using ant and bee colony optimization for feature selection and classification (ac-abc hybrid). Swarm and Evolutionary Computation 36:27–36
Neggaz N, Ewees AA, Abd Elaziz M, Mafarja M (2020) Boosting salp swarm algorithm by sine cosine algorithm and disrupt operator for feature selection. Expert Systems with Applications 145:113103
Liu S, Wang H, Peng W, Yao W (2022) A surrogate-assisted evolutionary feature selection algorithm with parallel random grouping for high-dimensional classification. IEEE Transactions on Evolutionary Computation
Xue B, Zhang M, Browne WN (2012) Particle swarm optimization for feature selection in classification: A multi-objective approach. IEEE Transactions on Cybernetics 43(6):1656–1671
Amoozegar M, Minaei-Bidgoli B (2018) Optimizing multi-objective pso based feature selection method using a feature elitism mechanism. Expert Systems with Applications 113:499–514
Hu Y, Zhang Y, Gong D (2020) Multiobjective particle swarm optimization for feature selection with fuzzy cost. IEEE Transactions on Cybernetics 51(2):874–888
Zhu Y, Liang J, Chen J, Ming Z (2017) An improved nsga-iii algorithm for feature selection used in intrusion detection. Knowledge-Based Systems 116:74–85
González J, Ortega J, Damas M, Martín-Smith P, Gan JQ (2019) A new multi-objective wrapper method for feature selection-accuracy and stability analysis for bci. Neurocomputing 333:407–418
Zhang Y, Cheng S, Shi Y, Gong D-w, Zhao X (2019) Cost-sensitive feature selection using two-archive multi-objective artificial bee colony algorithm. Expert Systems with Applications 137:46–58
Xu H, Xue B, Zhang M (2020) A duplication analysis-based evolutionary algorithm for biobjective feature selection. IEEE Transactions on Evolutionary Computation 25(2):205–218
Cheng F, Chu F, Xu Y, Zhang L (2021) A steering-matrix-based multiobjective evolutionary algorithm for high-dimensional feature selection. IEEE Transactions on Cybernetics
Nguyen BH, Xue B, Andreae P, Ishibuchi H, Zhang M (2019) Multiple reference points-based decomposition for multiobjective feature selection in classification: Static and dynamic mechanisms. IEEE Transactions on Evolutionary Computation 24(1):170–184
Al-Tashi Q, Abdulkadir SJ, Rais HM, Mirjalili S, Alhussian H (2020) Approaches to multi-objective feature selection: A systematic literature review. IEEE Access 8:125076–125096
Ma H, Shen S, Yu M, Yang Z, Fei M, Zhou H (2019) Multi-population techniques in nature inspired optimization algorithms: A comprehensive survey. Swarm and Evolutionary Computation 44:365–387
Li C, Nguyen TT, Yang M, Yang S, Zeng S (2015) Multi-population methods in unconstrained continuous dynamic environments: The challenges. Information Sciences 296:95–118
Li Y, Zeng X (2008) Feature selection method with multi-population agent genetic algorithm. In: International Conference on Neural Information Processing, Springer, pp 493–500
Kılıç F, Kaya Y, Yildirim S (2021) A novel multi population based particle swarm optimization for feature selection. Knowledge-Based Systems 219:106894
Deb K, Agrawal S, Pratap A, Meyarivan T (2000) A fast elitist non-dominated sorting genetic algorithm for multi-objective optimization: Nsga-ii. In: International Conference on Parallel Problem Solving From Nature, Springer, pp 849–858
Raquel CR, Naval PC Jr (2005) An effective use of crowding distance in multiobjective particle swarm optimization. In: Proceedings of the 7th Annual conference on Genetic and Evolutionary Computation, pp 257–264
Akbari R, Hedayatzadeh R, Ziarati K, Hassanizadeh B (2012) A multi-objective artificial bee colony algorithm. Swarm and Evolutionary Computation 2:39–52
Dua D, Graff C (2017) UCI machine learning repository. http://archive.ics.uci.edu/ml
Samieiyan B, MohammadiNasab P, Mollaei MA, Hajizadeh F, Kangavari M (2022) Novel optimized crow search algorithm for feature selection. Expert Systems with Applications 117486
Wang X, Wang Y, Wong K-C, Li X (2022) A self-adaptive weighted differential evolution approach for large-scale feature selection. Knowledge-Based Systems 235:107633
Xue Y, Xue B, Zhang M (2019) Self-adaptive particle swarm optimization for large-scale feature selection in classification. ACM Transactions on Knowledge Discovery from Data (TKDD) 13(5):1–27
Li A-D, He Z, Wang Q, Zhang Y (2019) Key quality characteristics selection for imbalanced production data using a two-phase bi-objective feature selection method. European Journal of Operational Research 274(3):978–989
Li A-D, Xue B, Zhang M (2020) Multi-objective feature selection using hybridization of a genetic algorithm and direct multisearch for key quality characteristic selection. Information Sciences 523:245–265
Auger A, Bader J, Brockhoff D, Zitzler E (2009) Theory of the hypervolume indicator: optimal \(\mu \)-distributions and the choice of the reference point. In: Proceedings of the tenth ACM SIGEVO workshop on Foundations of genetic algorithms, pp 87–102
Funding
This work is supported in part by the National Key Research and Development Program of China (No. 2020YFB1805400); in part by the National Natural Science Foundation of China (No. U19A2068, No. 62032002, and No. 62101358); in part by the China Postdoctoral Science Foundation (No. 2020M683345); and in part by the Fundamental Research Funds for the Central Universities (Grant No. SCU2021D052).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yang, J., He, J., Li, W. et al. MPF-FS: A multi-population framework based on multi-objective optimization algorithms for feature selection. Appl Intell 53, 22179–22199 (2023). https://doi.org/10.1007/s10489-023-04696-0