Abstract
Finding machine learning models that generalize well on small, high-dimensional datasets remains a challenge. This paper presents a hybrid methodology for this kind of problem that combines HYB-PARSIMONY and Bayesian Optimization. The methodology runs HYB-PARSIMONY with different random seeds and selects the features with the highest mean probability across runs. The hyperparameters are then tuned with Bayesian Optimization using only these features. The results show that the methodology substantially improves the generalization and parsimony of the resulting models compared with previous methods.
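The feature-aggregation step described above can be illustrated with a minimal sketch. It does not call the HYB-PARSIMONY library: `run_search_stub` is a hypothetical stand-in that returns one probability per feature for a given seed, and the choice of a fixed top-`k` cut-off (rather than, say, a probability threshold) is an assumption made only for illustration.

```python
import random
from statistics import mean


def run_search_stub(seed, n_features, informative):
    """Hypothetical stand-in for one search run with a given random seed.

    Returns one selection probability per feature. A real run would come
    from the search method itself; here the informative features simply
    receive higher values than the rest.
    """
    rng = random.Random(seed)
    return [
        rng.uniform(0.6, 1.0) if j in informative else rng.uniform(0.0, 0.5)
        for j in range(n_features)
    ]


def mean_feature_probability(runs):
    """Average the per-feature probabilities over all runs (seeds)."""
    return [mean(column) for column in zip(*runs)]


def select_top_features(mean_probs, k):
    """Return the sorted indices of the k features with the highest mean."""
    ranked = sorted(range(len(mean_probs)), key=lambda j: mean_probs[j], reverse=True)
    return sorted(ranked[:k])


seeds = range(10)          # one run per random seed
informative = {1, 4, 7}    # assumed ground truth for this toy example
runs = [run_search_stub(s, n_features=10, informative=informative) for s in seeds]

mean_probs = mean_feature_probability(runs)
selected = select_top_features(mean_probs, k=3)
print(selected)  # → [1, 4, 7]
```

In the methodology, the selected feature subset would then be passed to the hyperparameter tuning stage with Bayesian Optimization, which is not shown here.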
Notes
1. HYB-PARSIMONY is available for Python at https://github.com/jodivaso/HYBparsimony.
2. The total number of experiments was 115170, resulting from all combinations.
3. As can be seen in Fig. 5.
Acknowledgement
The work is supported by grant PID2020-116641GB-I00 and the European Regional Development Fund under Project PID2021-123219OB-I00 funded by MCIN/AEI/10.13039/501100011033 FEDER, UE. We are also greatly indebted to Banco Santander for the REGI2020/41 and REGI2022/60 fellowships.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Divasón, J., Pernia-Espinoza, A., Romero, A., Martinez-de-Pison, F.J. (2023). Hybrid Intelligent Parsimony Search in Small High-Dimensional Datasets. In: García Bringas, P., et al. Hybrid Artificial Intelligent Systems. HAIS 2023. Lecture Notes in Computer Science, vol 14001. Springer, Cham. https://doi.org/10.1007/978-3-031-40725-3_33
DOI: https://doi.org/10.1007/978-3-031-40725-3_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-40724-6
Online ISBN: 978-3-031-40725-3
eBook Packages: Computer Science, Computer Science (R0)