Abstract
Pruning is a major research topic in neural networks, where it improves efficiency and generalisation. Pruning in genetic programming (GP) is likewise an active area, with researchers exploring new techniques to optimise the performance of GP models. This research introduces a novel pruning algorithm for Genetic Programming-based Symbolic Regression (GPSR). The proposed method employs a weighting mechanism to identify and filter out unimportant subtrees in each generation. To achieve this, the method arranges all subtrees linearly and assigns a weight to each subtree and terminal. It then uses Ordinary Least Squares (OLS) to optimise these weights, so that unimportant subtrees and terminals can be identified and pruned. The algorithm's effectiveness was evaluated on ten regression datasets, including high-dimensional and complex feature sets, and its performance was compared with that of two other algorithms. The results indicate that the proposed approach not only achieves better learning and generalisation performance but also generates smaller trees than standard GP, thereby improving interpretability.
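The weighting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the subtree and terminal outputs are already available as columns of a matrix, and the importance score (absolute OLS weight scaled by the column's standard deviation), the `threshold` value, and the function names `ols_weights` and `unimportant_mask` are all hypothetical choices made for illustration.

```python
import numpy as np


def ols_weights(outputs, y):
    """Fit y ~ outputs @ w + b by ordinary least squares and return (w, b)."""
    # Append a column of ones so the intercept b is fitted with the weights.
    design = np.column_stack([outputs, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[:-1], coef[-1]


def unimportant_mask(outputs, y, threshold=0.01):
    """Flag columns whose share of the total weighted contribution is small.

    Assumption: importance is |weight| * std(column), normalised to sum to 1.
    The abstract does not specify the criterion used in the paper.
    """
    w, _ = ols_weights(outputs, y)
    contribution = np.abs(w) * outputs.std(axis=0)
    total = contribution.sum()
    if total == 0.0:
        # Nothing contributes, so every column is a pruning candidate.
        return np.ones(len(w), dtype=bool)
    return contribution / total < threshold


# Synthetic data: three hypothetical subtree outputs, one of them irrelevant.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(200, 3))
outputs = np.column_stack([
    x[:, 0] * x[:, 1],   # subtree 0: used by the target
    np.sin(x[:, 2]),     # subtree 1: used by the target
    x[:, 0] - x[:, 2],   # subtree 2: not used by the target
])
y = 2.0 * outputs[:, 0] + 0.5 * outputs[:, 1]

weights, intercept = ols_weights(outputs, y)
mask = unimportant_mask(outputs, y)
print("weights:", np.round(weights, 6))
print("prune candidates:", mask)
```

In this toy setting the target is an exact linear combination of the first two columns, so the third column receives a weight near zero and is the only one flagged. On real GP populations the subtree outputs are often highly correlated, which makes raw OLS weights less reliable as importance scores; how the paper handles that case is not stated in the abstract.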
Acknowledgement
This work is supported in part by the Marsden Fund of the New Zealand Government under Contracts MFP-VUW2016 and MFP-VUW1913.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Rimas, M., Chen, Q., Zhang, M. (2025). Importance-Based Pruning for Genetic Programming Based Symbolic Regression. In: Gong, M., Song, Y., Koh, Y.S., Xiang, W., Wang, D. (eds) AI 2024: Advances in Artificial Intelligence. AI 2024. Lecture Notes in Computer Science, vol 15443. Springer, Singapore. https://doi.org/10.1007/978-981-96-0351-0_14
DOI: https://doi.org/10.1007/978-981-96-0351-0_14
Publisher Name: Springer, Singapore
Print ISBN: 978-981-96-0350-3
Online ISBN: 978-981-96-0351-0
eBook Packages: Computer Science, Computer Science (R0)