DOI: 10.1145/1276958.1277291
Article

Improving the human readability of features constructed by genetic programming

Published: 07 July 2007

Abstract

The use of machine learning techniques to automatically analyse data for information is becoming increasingly widespread. In this paper we examine the use of Genetic Programming and a Genetic Algorithm to pre-process data before it is classified by an external classifier. Genetic Programming is combined with a Genetic Algorithm to construct and select new features from those available in the data, a potentially significant process for data mining since it gives consideration to hidden relationships between features. We then examine techniques to improve the human readability of these new features and extract more information about the domain.
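To make the idea of a more readable constructed feature concrete, here is a minimal sketch of one possible post-processing step: representing a constructed feature as an expression tree and shortening it with a few value-preserving rewrites. The abstract does not say which techniques the paper uses, so the tuple-based tree encoding, the rewrite rules, and the feature names (`petal_len`, `sepal_w`, `petal_w`) are all assumptions made for illustration, not the authors' method.

```python
# Illustrative sketch only: the tree encoding, rewrite rules, and feature
# names are assumptions, not the representation or method used in the paper.
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(tree, row):
    """Compute a constructed feature for one data row (dict of original features)."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, row), evaluate(right, row))
    if isinstance(tree, str):
        return row[tree]  # leaf naming an original feature
    return tree  # numeric constant


def simplify(tree):
    """Apply a few rewrites bottom-up; they preserve the value for finite inputs."""
    if not isinstance(tree, tuple):
        return tree
    op, left, right = tree
    left, right = simplify(left), simplify(right)
    numeric = (int, float)
    if isinstance(left, numeric) and isinstance(right, numeric):
        return OPS[op](left, right)  # constant folding
    if op == "+" and left == 0:
        return right  # 0 + x -> x
    if op in ("+", "-") and right == 0:
        return left  # x + 0 -> x, x - 0 -> x
    if op == "*" and (left == 0 or right == 0):
        return 0  # x * 0 -> 0
    if op == "*" and left == 1:
        return right  # 1 * x -> x
    if op == "*" and right == 1:
        return left  # x * 1 -> x
    if op == "-" and left == right:
        return 0  # x - x -> 0
    return (op, left, right)


def to_infix(tree):
    """Render a tree as a fully parenthesised infix string."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return f"({to_infix(left)} {op} {to_infix(right)})"
    return str(tree)


# A hypothetical bloated feature of the kind an evolutionary search might produce.
bloated = ("+",
           ("*", "petal_len", ("-", 3, 2)),
           ("*", ("-", "sepal_w", "sepal_w"), "petal_w"))
row = {"petal_len": 4.5, "sepal_w": 3.0, "petal_w": 1.2}
simplified = simplify(bloated)
```

Here `to_infix(bloated)` gives `((petal_len * (3 - 2)) + ((sepal_w - sepal_w) * petal_w))`, while `simplified` is just `petal_len`, and both evaluate to the same value on `row`. A rule-based pass like this only removes redundancy that is visible in the syntax; pressure toward small trees during evolution (parsimony) is a separate, complementary approach.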

Cited By

  • (2023) Multi-Objective Multi-Gene Genetic Programming for the Prediction of Leakage in Water Distribution Networks. Proceedings of the Genetic and Evolutionary Computation Conference, pp. 1357-1364. DOI: 10.1145/3583131.3590499. Online publication date: 15-Jul-2023
  • (2023) An Efficient Iterative Approach to Explainable Feature Learning. IEEE Transactions on Neural Networks and Learning Systems 34:5, pp. 2606-2618. DOI: 10.1109/TNNLS.2021.3107049. Online publication date: May-2023
  • (2022) Evolving Interpretable Classification Models via Readability-Enhanced Genetic Programming. 2022 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 1691-1697. DOI: 10.1109/SSCI51031.2022.10022164. Online publication date: 4-Dec-2022
  • (2018) Have your spaghetti and eat it too. Genetic Programming and Evolvable Machines 12:2, pp. 121-160. DOI: 10.1007/s10710-010-9122-1. Online publication date: 24-Dec-2018

    Published In

    GECCO '07: Proceedings of the 9th annual conference on Genetic and evolutionary computation
    July 2007
    2313 pages
    ISBN:9781595936974
    DOI:10.1145/1276958

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. classification
    2. feature construction
    3. feature selection
    4. genetic algorithm
    5. genetic programming
    6. human readability
    7. knowledge discovery
    8. parsimony
    9. post-processing

    Qualifiers

    • Article

    Conference

    GECCO07

    Acceptance Rates

    GECCO '07 Paper Acceptance Rate 266 of 577 submissions, 46%;
    Overall Acceptance Rate 1,669 of 4,410 submissions, 38%
