Abstract
In ensemble feature selection, adjusting the weight assigned to each feature subset can change the ensemble's performance substantially; finding an optimized weight vector is therefore a key and challenging problem. Targeting this optimization problem, this paper proposes an ensemble feature selection approach based on a genetic algorithm (EFS-BGA). After each base feature selector generates a feature subset, EFS-BGA uses a genetic algorithm to learn an optimized weight for each subset, in contrast to traditional genetic-algorithm approaches that operate directly on individual features. We present two variants of EFS-BGA: a complete ensemble feature selection method and, building on it, a selective EFS-BGA model. We then show, through mathematical analysis, why weight adjustment is an optimization problem and how it can be optimized. Finally, comparative experiments on multiple data sets demonstrate the practical advantages of EFS-BGA over previous ensemble feature selection algorithms.
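The idea described in the abstract can be sketched in code. This is a minimal toy illustration, not the paper's actual implementation: the base selectors' subsets and the feature-relevance scores below are hypothetical, and a synthetic fitness stands in for the classifier-accuracy evaluation a real wrapper method would use. It shows only the structural point the abstract makes, namely that the genetic algorithm evolves a weight vector over the *subsets*, and the final feature mask is obtained by a weighted vote of those subsets.

```python
import random

N_FEATURES = 8

# Assume three base feature selectors each returned a subset (0/1 mask).
SUBSETS = [
    [1, 1, 0, 0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 0, 0],
    [1, 0, 0, 1, 0, 0, 1, 0],
]

# Hypothetical relevance of each feature, standing in for the validation
# accuracy a real wrapper would measure after training a classifier.
RELEVANCE = [0.9, 0.8, 0.1, 0.7, 0.6, 0.1, 0.2, 0.05]

def aggregate(weights, threshold=0.5):
    """Weighted vote over subsets: keep a feature whose weighted
    support exceeds the threshold."""
    total = sum(weights) or 1e-9
    scores = [sum(w * s[j] for w, s in zip(weights, SUBSETS)) / total
              for j in range(N_FEATURES)]
    return [1 if sc > threshold else 0 for sc in scores]

def fitness(weights):
    """Stand-in fitness: reward keeping relevant features, penalize
    keeping irrelevant ones."""
    mask = aggregate(weights)
    return sum((r - 0.5) * m for r, m in zip(RELEVANCE, mask))

def evolve(pop_size=20, gens=30, seed=0):
    """Simple GA over subset-weight vectors: truncation selection,
    one-point crossover, random-reset mutation, elitist survival."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in SUBSETS] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]          # keep the top half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(SUBSETS))
            child = a[:cut] + b[cut:]           # one-point crossover
            if rng.random() < 0.2:              # mutate one weight
                child[rng.randrange(len(child))] = rng.random()
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

best = evolve()
print("best subset weights:", best)
print("selected features:  ", aggregate(best))
```

Note that the chromosome has one gene per base selector (three here), not one gene per feature, which is exactly the distinction the abstract draws between EFS-BGA and a genetic algorithm that searches the feature space directly.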
Acknowledgements
This work was funded by NSFC (Grant Nos. U1509216, U1866602, 61472099, 61602129) and the National Key Research and Development Program of China (Grant No. 2016YFB1000703).
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Communicated by V. Loia.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Wang, H., He, C. & Li, Z. A new ensemble feature selection approach based on genetic algorithm. Soft Comput 24, 15811–15820 (2020). https://doi.org/10.1007/s00500-020-04911-x