
Hybrid Intelligent Parsimony Search in Small High-Dimensional Datasets

  • Conference paper
Hybrid Artificial Intelligent Systems (HAIS 2023)

Abstract

Finding machine learning models that generalize well on small high-dimensional datasets remains a challenge. This paper presents a hybrid methodology for this kind of problem that combines HYB-PARSIMONY and Bayesian Optimization. The methodology runs HYB-PARSIMONY with different random seeds and selects the features with the highest mean probability. Hyperparameters are then tuned with Bayesian Optimization using only these features. The results show that the methodology substantially improves the generalization and parsimony of the obtained models compared to previous methods.
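The seed-aggregation step described in the abstract can be illustrated with a minimal sketch. It does not call HYB-PARSIMONY itself: `toy_selector` is a hypothetical stand-in that returns one per-feature selection probability vector per seed, and the fixed `threshold` used to decide which mean probabilities count as "highest" is an assumption, not a value taken from the paper. The later Bayesian Optimization step is not shown.

```python
import random
from statistics import mean


def toy_selector(seed, n_features, informative):
    """Hypothetical stand-in for one feature-selection run with a given seed.

    Returns one selection probability per feature. Informative features get
    high values and the rest get low values, only to make the example concrete.
    """
    rng = random.Random(seed)
    probs = []
    for j in range(n_features):
        if j in informative:
            probs.append(rng.uniform(0.6, 1.0))
        else:
            probs.append(rng.uniform(0.0, 0.5))
    return probs


def mean_feature_probability(runs):
    """Average each feature's probability over all runs (one run per seed)."""
    n_features = len(runs[0])
    return [mean(run[j] for run in runs) for j in range(n_features)]


def select_features(runs, threshold=0.5):
    """Keep the indices of features whose mean probability reaches the threshold.

    The threshold rule is an assumption made for this sketch.
    """
    means = mean_feature_probability(runs)
    return [j for j, p in enumerate(means) if p >= threshold]


# Ten runs with different random seeds, then aggregation across seeds.
runs = [toy_selector(seed, n_features=8, informative={0, 3, 5}) for seed in range(10)]
selected = select_features(runs, threshold=0.5)
print(selected)
```

Averaging over seeds reduces the influence of any single run, which is one plausible reason for repeating the search on small datasets. The selected indices would then define the reduced feature set passed to the hyperparameter tuning stage.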


Notes

  1. HYB-PARSIMONY is available for Python at https://github.com/jodivaso/HYBparsimony.

  2. The total number of experiments was 115170, resulting from all combinations.

  3. As can be seen in Fig. 5.


Acknowledgement

The work is supported by grant PID2020-116641GB-I00 and the European Regional Development Fund under Project PID2021-123219OB-I00 funded by MCIN/AEI/10.13039/501100011033 FEDER, UE. We are also greatly indebted to Banco Santander for the REGI2020/41 and REGI2022/60 fellowships.

Author information

Corresponding author

Correspondence to Francisco Javier Martinez-de-Pison.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Divasón, J., Pernia-Espinoza, A., Romero, A., Martinez-de-Pison, F.J. (2023). Hybrid Intelligent Parsimony Search in Small High-Dimensional Datasets. In: García Bringas, P., et al. Hybrid Artificial Intelligent Systems. HAIS 2023. Lecture Notes in Computer Science, vol 14001. Springer, Cham. https://doi.org/10.1007/978-3-031-40725-3_33

  • DOI: https://doi.org/10.1007/978-3-031-40725-3_33

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-40724-6

  • Online ISBN: 978-3-031-40725-3

  • eBook Packages: Computer Science (R0)
