A new ensemble feature selection approach based on genetic algorithm

  • Methodologies and Application
  • Published in: Soft Computing

Abstract

In ensemble feature selection, the weight assigned to each feature subset can change the ensemble result substantially, so finding the optimal weight vector is a key and challenging problem. To address this optimization problem, this paper proposes an ensemble feature selection approach based on a genetic algorithm (EFS-BGA). After each base feature selector generates a feature subset, the EFS-BGA method uses a genetic algorithm to obtain an optimized weight for each feature subset, unlike traditional genetic algorithms that operate directly on individual features. We divide the EFS-BGA algorithm into two variants: the first is a complete ensemble feature selection method, and building on it we further propose a selective EFS-BGA model. We then provide a mathematical analysis that explains why weight adjustment is an optimization problem and how it can be solved. Finally, comparative experiments on multiple data sets demonstrate the advantages of the proposed EFS-BGA algorithm over previous ensemble feature selection algorithms.
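The sketch below illustrates the general idea described in the abstract: several base feature selectors each produce a feature-subset mask, and a genetic algorithm searches over a weight vector with one gene per subset rather than per individual feature. This is a minimal, hypothetical illustration, not the authors' EFS-BGA implementation; the data set, base selectors, fitness function, and GA operators (truncation selection with Gaussian mutation) are assumptions chosen only to keep the example short and runnable.

    # Minimal, hypothetical sketch of GA-weighted ensemble feature selection
    # (not the authors' EFS-BGA code); all concrete choices are illustrative.
    import numpy as np
    from sklearn.datasets import load_breast_cancer
    from sklearn.feature_selection import SelectKBest, f_classif, mutual_info_classif
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = load_breast_cancer(return_X_y=True)

    # Step 1: each base feature selector emits a binary feature-subset mask.
    base_masks = np.array([
        SelectKBest(score_func, k=10).fit(X, y).get_support().astype(float)
        for score_func in (f_classif, mutual_info_classif)
    ])                                              # shape: (n_selectors, n_features)

    def fitness(weights, threshold=0.5):
        # Aggregate the subset "votes" with the candidate weights, keep features
        # whose weighted score passes the threshold, and evaluate them by
        # cross-validated accuracy (an assumed fitness function).
        score = weights @ base_masks / (weights.sum() + 1e-12)
        keep = score >= threshold
        if not keep.any():
            return 0.0
        clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
        return cross_val_score(clf, X[:, keep], y, cv=3).mean()

    # Step 2: a tiny GA over the weight vector -- one gene per feature subset,
    # truncation selection plus Gaussian mutation.
    rng = np.random.default_rng(0)
    pop = rng.random((20, base_masks.shape[0]))      # 20 candidate weight vectors
    for _ in range(30):                              # generations
        fit = np.array([fitness(w) for w in pop])
        parents = pop[np.argsort(fit)[-10:]]         # keep the 10 fittest
        children = parents[rng.integers(10, size=10)]
        children = np.clip(children + rng.normal(0.0, 0.1, children.shape), 0.0, 1.0)
        pop = np.vstack([parents, children])

    best = pop[np.argmax([fitness(w) for w in pop])]
    print("best subset weights:", np.round(best, 3))
    print("selected features:", np.flatnonzero(best @ base_masks / (best.sum() + 1e-12) >= 0.5))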

Acknowledgements

This work was funded by the NSFC (Grant Nos. U1509216, U1866602, 61472099, 61602129) and the National Key Research and Development Program of China (Grant No. 2016YFB1000703).

Author information

Corresponding author

Correspondence to Hongzhi Wang.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Communicated by V. Loia.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wang, H., He, C. & Li, Z. A new ensemble feature selection approach based on genetic algorithm. Soft Comput 24, 15811–15820 (2020). https://doi.org/10.1007/s00500-020-04911-x

  • DOI: https://doi.org/10.1007/s00500-020-04911-x
