Application of Multi-objective Optimization to Feature Selection for a Difficult Data Classification Task

  • Conference paper
Computational Science – ICCS 2021 (ICCS 2021)

Abstract

Many decision problems require a compromise between competing goals. The state of a given object is often determined by a specific group of features. Feature selection is an example of such a task: it can improve the quality of decisions while minimizing the cost of the features or keeping within a total budget. The main purpose of this work is to compare three feature selection methods: the classical approach, single-objective optimization, and multi-objective optimization. The article proposes a feature selection algorithm that uses a Genetic Algorithm with two criteria, cost and accuracy. In this way, Pareto-optimal points were obtained for the nonlinear multi-criteria optimization problem. These points represent a compromise between the two conflicting objectives. Experiments with several base classifiers show that the proposed approach can be applied to classification tasks on difficult data.
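To illustrate what a Pareto-optimal compromise between cost and accuracy means, the sketch below filters a handful of candidate feature subsets down to the non-dominated ones. It is not the authors' algorithm: the per-feature costs, the accuracy values, and the names `dominates` and `pareto_front` are all hypothetical, and a real run would obtain candidates from a search procedure such as a Genetic Algorithm and accuracies from a trained classifier.

```python
from typing import Dict, List, Tuple

Mask = Tuple[int, ...]            # 1 = feature selected, 0 = feature dropped
Point = Tuple[Mask, float, float]  # (mask, total cost, accuracy)

# Hypothetical per-feature acquisition costs (assumption, not from the paper).
FEATURE_COSTS = [1.0, 2.0, 4.0]

# Hypothetical accuracies for a few candidate subsets (assumption).
ACCURACY: Dict[Mask, float] = {
    (1, 0, 0): 0.70,
    (0, 1, 0): 0.72,
    (1, 1, 0): 0.80,
    (0, 0, 1): 0.78,
    (1, 0, 1): 0.90,
    (1, 1, 1): 0.88,
}


def total_cost(mask: Mask) -> float:
    """Sum the costs of the selected features."""
    return sum(c for c, bit in zip(FEATURE_COSTS, mask) if bit)


def dominates(a: Point, b: Point) -> bool:
    """a dominates b if it is no worse on both objectives and better on one.

    Cost is minimized, accuracy is maximized.
    """
    _, cost_a, acc_a = a
    _, cost_b, acc_b = b
    no_worse = cost_a <= cost_b and acc_a >= acc_b
    strictly_better = cost_a < cost_b or acc_a > acc_b
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


points = [(mask, total_cost(mask), acc) for mask, acc in ACCURACY.items()]
front = pareto_front(points)

for mask, cost, acc in front:
    print(f"mask={mask} cost={cost:.1f} accuracy={acc:.2f}")
```

With these made-up numbers, the subset (0, 0, 1) is dropped because (1, 1, 0) is both cheaper and more accurate, and the full feature set (1, 1, 1) is dropped because (1, 0, 1) beats it on both objectives. The remaining points are the trade-offs a decision maker would choose among.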

Notes

  1. https://github.com/joannagrzyb/moofs

Acknowledgements

This work was supported by the Polish National Science Centre under the grant No. 2019/35/B/ST6/04442.

Corresponding author

Correspondence to Joanna Grzyb.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Grzyb, J., Topolski, M., Woźniak, M. (2021). Application of Multi-objective Optimization to Feature Selection for a Difficult Data Classification Task. In: Paszynski, M., Kranzlmüller, D., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M. (eds) Computational Science – ICCS 2021. ICCS 2021. Lecture Notes in Computer Science, vol 12744. Springer, Cham. https://doi.org/10.1007/978-3-030-77967-2_8

  • DOI: https://doi.org/10.1007/978-3-030-77967-2_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-77966-5

  • Online ISBN: 978-3-030-77967-2

  • eBook Packages: Computer Science, Computer Science (R0)
