Post-processing Evolved Decision Trees

  • Chapter

Part of the book series: Studies in Computational Intelligence (SCI, volume 204)

Abstract

Although Genetic Programming (GP) is a very general technique, it is also quite powerful. In fact, GP has often been shown to outperform more specialized techniques on a variety of tasks. In data mining, GP has been applied successfully to most major tasks, e.g. classification, regression and clustering. In this chapter, we introduce, describe and evaluate a straightforward novel algorithm for post-processing genetically evolved decision trees. The algorithm works iteratively, one node at a time, searching for modifications that will result in higher accuracy. More specifically, for each interior test, the algorithm evaluates every possible split for the current attribute and chooses the best. With this design, the post-processing algorithm can only increase training accuracy, never decrease it. In the experiments, the suggested algorithm is applied to GP decision trees, either induced directly from datasets or extracted from neural network ensembles. The experiments, using 22 UCI datasets, show that the suggested post-processing technique results in higher test set accuracies on a large majority of the datasets. The increase in test accuracy is statistically significant for one of the four evaluated setups, and substantial on two of the other three.
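
The node-by-node search described in the abstract can be illustrated with a minimal Python sketch. It is not the chapter's implementation: the tree representation (`Leaf`, `Split`), the restriction to numeric tests of the form `x[attr] <= threshold`, the use of observed attribute values as candidate splits, and the function name `post_process` are all assumptions made for illustration. A candidate split is kept only if it strictly raises training accuracy, which is why training accuracy can never decrease.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: int


@dataclass
class Split:
    attr: int          # index of the tested attribute (assumed numeric)
    threshold: float   # the test is: x[attr] <= threshold
    left: "Node"       # branch taken when the test is true
    right: "Node"      # branch taken when the test is false


Node = Union[Leaf, Split]


def predict(node, x):
    """Follow the tests from the root down to a leaf and return its label."""
    while isinstance(node, Split):
        node = node.left if x[node.attr] <= node.threshold else node.right
    return node.label


def accuracy(tree, X, y):
    """Fraction of instances in (X, y) that the tree classifies correctly."""
    return sum(predict(tree, x) == t for x, t in zip(X, y)) / len(y)


def interior_nodes(node):
    """Yield every interior test of the tree in pre-order."""
    if isinstance(node, Split):
        yield node
        yield from interior_nodes(node.left)
        yield from interior_nodes(node.right)


def post_process(tree, X, y):
    """Hypothetical post-processing loop: adjust thresholds in place.

    For each interior test, every observed value of the node's current
    attribute is tried as the threshold; the best one is kept only if it
    strictly improves accuracy on the training data. Passes are repeated
    until a full pass yields no improvement.
    """
    best = accuracy(tree, X, y)
    improved = True
    while improved:
        improved = False
        for node in interior_nodes(tree):
            best_threshold = node.threshold
            for candidate in sorted({x[node.attr] for x in X}):
                node.threshold = candidate
                acc = accuracy(tree, X, y)
                if acc > best:
                    best, best_threshold = acc, candidate
                    improved = True
            # Restore the original threshold unless a better one was found.
            node.threshold = best_threshold
    return best


# Toy usage: a single badly placed split on one numeric attribute.
X = [[1.0], [2.0], [3.0], [4.0]]
y = [0, 0, 1, 1]
tree = Split(attr=0, threshold=0.5, left=Leaf(0), right=Leaf(1))
before = accuracy(tree, X, y)
after = post_process(tree, X, y)
```

Because the search is driven by training accuracy alone, the sketch says nothing about test accuracy; that question is what the chapter's experiments on the 22 UCI datasets address.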

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Johansson, U., König, R., Löfström, T., Sönströd, C., Niklasson, L. (2009). Post-processing Evolved Decision Trees. In: Abraham, A., Hassanien, AE., de Carvalho, A.P.d.L.F. (eds) Foundations of Computational Intelligence Volume 4. Studies in Computational Intelligence, vol 204. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01088-0_7

  • DOI: https://doi.org/10.1007/978-3-642-01088-0_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-01087-3

  • Online ISBN: 978-3-642-01088-0

  • eBook Packages: Engineering, Engineering (R0)