Abstract
Although Genetic Programming (GP) is a very general technique, it is also quite powerful. GP has often been shown to outperform more specialized techniques on a variety of tasks. In data mining, GP has been applied successfully to most major tasks, e.g., classification, regression and clustering. In this chapter, we introduce, describe and evaluate a straightforward novel algorithm for post-processing genetically evolved decision trees. The algorithm works iteratively, one node at a time, searching for modifications that result in higher accuracy. More specifically, for each interior test, the algorithm evaluates every possible split for the current attribute and chooses the best. With this design, the post-processing algorithm can only increase training accuracy, never decrease it. In the experiments, the suggested algorithm is applied to GP decision trees, either induced directly from datasets or extracted from neural network ensembles. The experimentation, using 22 UCI datasets, shows that the suggested post-processing technique results in higher test set accuracies on a large majority of the datasets. The increase in test accuracy is statistically significant for one of the four evaluated setups, and substantial on two of the other three.
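The post-processing idea described above can be illustrated with a minimal sketch. It is not the chapter's implementation: the tree representation (`Leaf`, `Split`), the restriction to numeric attributes tested as `x[attr] <= threshold`, the use of observed training values as candidate split points, and the function names are all assumptions made for illustration. Each interior node keeps its attribute, every candidate threshold is tried with the rest of the tree held fixed, and a change is kept only if the training accuracy of the whole tree strictly improves, so training accuracy cannot decrease.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: int


@dataclass
class Split:
    attr: int          # index of the attribute tested at this node (never changed)
    threshold: float   # instances with x[attr] <= threshold go left
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Split]


def predict(tree, x):
    """Follow the tests from the root until a leaf is reached."""
    while isinstance(tree, Split):
        tree = tree.left if x[tree.attr] <= tree.threshold else tree.right
    return tree.label


def accuracy(tree, X, y):
    """Fraction of instances in (X, y) that the tree classifies correctly."""
    return sum(predict(tree, x) == t for x, t in zip(X, y)) / len(y)


def interior_nodes(tree):
    """Yield every interior (test) node in pre-order."""
    if isinstance(tree, Split):
        yield tree
        yield from interior_nodes(tree.left)
        yield from interior_nodes(tree.right)


def post_process(tree, X, y):
    """Tune thresholds in place, one node at a time; return training accuracy.

    Assumption: candidate splits are the values of the node's attribute
    observed in the training data. A candidate replaces the current
    threshold only if whole-tree training accuracy strictly increases.
    """
    best = accuracy(tree, X, y)
    improved = True
    while improved:                      # repeat until a full pass changes nothing
        improved = False
        for node in interior_nodes(tree):
            best_threshold = node.threshold
            for candidate in sorted({x[node.attr] for x in X}):
                node.threshold = candidate
                acc = accuracy(tree, X, y)
                if acc > best:
                    best, best_threshold = acc, candidate
                    improved = True
            node.threshold = best_threshold   # keep the best split found
    return best


if __name__ == "__main__":
    # Toy data (hypothetical): one numeric attribute, two classes.
    X = [[1.0], [2.0], [3.0], [4.0]]
    y = [0, 0, 1, 1]
    tree = Split(attr=0, threshold=1.0, left=Leaf(0), right=Leaf(1))
    print("before:", accuracy(tree, X, y))
    print("after: ", post_process(tree, X, y), "threshold:", tree.threshold)
```

Because the criterion is training accuracy only, the sketch says nothing about test accuracy; the effect on unseen data is what the chapter's experiments evaluate.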
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Johansson, U., König, R., Löfström, T., Sönströd, C., Niklasson, L. (2009). Post-processing Evolved Decision Trees. In: Abraham, A., Hassanien, AE., de Carvalho, A.P.d.L.F. (eds) Foundations of Computational Intelligence Volume 4. Studies in Computational Intelligence, vol 204. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01088-0_7
DOI: https://doi.org/10.1007/978-3-642-01088-0_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01087-3
Online ISBN: 978-3-642-01088-0
eBook Packages: Engineering, Engineering (R0)