Feature Construction and Dimension Reduction Using Genetic Programming

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4830)

Abstract

This paper describes a new approach to the use of genetic programming (GP) for feature construction in classification problems. Rather than wrapping a particular classifier for single feature construction as in most of the existing methods, this approach uses GP to construct multiple (high-level) features from the original features. These constructed features are then used by decision trees for classification. As feature construction is independent of classification, the fitness function is designed based on the class dispersion and entropy. This approach is examined and compared with the standard decision tree method, using the original features, and using a combination of the original features and constructed features, on 12 benchmark classification problems. The results show that the new approach outperforms the standard way of using decision trees on these problems in terms of the classification performance, dimension reduction and the learned decision tree size.
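The abstract does not give the fitness function itself, so the following is only a minimal sketch of how an entropy-based, classifier-independent score for a constructed feature might look. The equal-width binning, the function names, and the example expression `row[0] - row[1]` are all assumptions made for illustration; they are not taken from the paper, and the class-dispersion component is not modelled here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def binned_class_entropy(values, labels, n_bins=4):
    """Weighted average class entropy over equal-width bins of one feature.

    Lower is better: 0.0 means every bin contains a single class.
    The binning scheme is an illustrative assumption.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant feature
    bins = {}
    for v, y in zip(values, labels):
        idx = min(int((v - lo) / width), n_bins - 1)
        bins.setdefault(idx, []).append(y)
    n = len(values)
    return sum(len(ys) / n * entropy(ys) for ys in bins.values())


def constructed_feature(row):
    """Hypothetical GP-evolved expression over the original features."""
    return row[0] - row[1]


# Toy data: two original features per instance, two classes.
rows = [(1, 0), (2, 0), (3, 1), (0, 1), (0, 2), (1, 3)]
labels = [1, 1, 1, 0, 0, 0]

score_original = binned_class_entropy([r[0] for r in rows], labels)
score_constructed = binned_class_entropy(
    [constructed_feature(r) for r in rows], labels
)
print(score_original, score_constructed)
```

On this toy data the constructed feature separates the classes more cleanly than the first original feature, so it receives the lower score. A GP run in this style would evolve the expression inside `constructed_feature` to minimise such a score without ever calling the decision tree.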

Editor information

Mehmet A. Orgun, John Thornton

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Neshatian, K., Zhang, M., Johnston, M. (2007). Feature Construction and Dimension Reduction Using Genetic Programming. In: Orgun, M.A., Thornton, J. (eds) AI 2007: Advances in Artificial Intelligence. AI 2007. Lecture Notes in Computer Science, vol 4830. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76928-6_18

  • DOI: https://doi.org/10.1007/978-3-540-76928-6_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-76926-2

  • Online ISBN: 978-3-540-76928-6

  • eBook Packages: Computer Science (R0)
