Operator equalisation for bloat free genetic programming and a survey of bloat control methods

Genetic Programming and Evolvable Machines

Abstract

Bloat can be defined as an excess of code growth without a corresponding improvement in fitness. This problem has been one of the most intensively studied subjects since the beginnings of Genetic Programming. This paper begins by briefly reviewing the theories explaining bloat, and presenting a comprehensive survey and taxonomy of many of the bloat control methods published in the literature through the years. Particular attention is then given to the new Crossover Bias theory and the bloat control method it inspired, Operator Equalisation (OpEq). Two implementations of OpEq are described in detail. The results presented clearly show that Genetic Programming using OpEq is essentially bloat free. We discuss the advantages and shortcomings of each different implementation, and the unexpected effect of OpEq on overfitting. We observe the evolutionary dynamics of OpEq and address its potential to be extended and integrated into different elements of the evolutionary process.

Notes

  1. Of course selection distorts this distribution, but the crossover (length) bias is not a minor effect if considered with selection: in fact, according to the reported theory, it is exactly the crossover bias that, together with selection, causes average program size to grow.

  2. However, the term crossover bias is now widely used, so we are not proposing a change in terminology, but only providing a better definition and further clarification.

  3. ECJ—Evolutionary Computation in Java, http://cs.gmu.edu/~eclab/projects/ecj/

  4. The number of time steps actually used in the original work by Koza [47] was 600, but a typographical error caused the number 400 to become more popular in the literature, which is why we also use it.

  5. GPLAB—A Genetic Programming Toolbox for MATLAB, http://gplab.sourceforge.net.

References

  1. E. Alfaro-Cid, A. Esparcia-Alcazar, K. Sharman, F.F. de Vega, J.J. Merelo, Prune and plant: a new bloat control method for genetic programming, in Proceedings of the 8th International Conference on Hybrid Intelligent Systems (IEEE Press, Piscataway, 2008), pp. 31–35

  2. N.M.A. Al Salami, Genetic programming under theoretical definition. Int. J. Softw. Eng. Appl. 3(4), 51–64 (2009)

  3. L. Altenberg, The evolution of evolvability in genetic programming, in Advances in Genetic Programming, ed. by K.E. Kinnear Jr. (MIT Press, Cambridge, 1994), pp. 47–74

  4. P.J. Angeline, Genetic programming and emergent intelligence, in Advances in Genetic Programming, ed. by K.E. Kinnear Jr. (MIT Press, Cambridge, 1994), pp. 75–98

  5. P.J. Angeline, Two self-adaptive crossover operators for genetic programming, in Advances in Genetic Programming 2, ed. by P.J. Angeline, K.E. Kinnear Jr. (MIT Press, Cambridge, 1996), pp. 89–110

  6. P.J. Angeline, A historical perspective on the evolution of executable structures. Fundam. Informaticae 35(1–4), 179–195 (1998)

  7. P.J. Angeline, J.B. Pollack, Coevolving high-level representations, in Proceedings of Artificial Life III, ed. by C.G. Langton (Addison-Wesley, Reading, 1994), pp. 55–71

  8. F. Archetti, S. Lanzeni, E. Messina, L. Vanneschi, Genetic programming and other machine learning approaches to predict median oral lethal dose (LD50) and plasma protein binding levels (%PPB) of drugs, in Proceedings of EvoBIO-2007, ed. by E. Marchiori et al. (Springer, Berlin, 2007), pp. 11–23

  9. F. Archetti, E. Messina, S. Lanzeni, L. Vanneschi, Genetic programming for computational pharmacokinetics in drug discovery and development. Genet. Program. Evolvable Mach. 8(4), 17–26 (2007)

  10. W. Banzhaf, P. Nordin, R.E. Keller, F.D. Francone, Genetic Programming—An Introduction (dpunkt.verlag and Morgan Kaufmann, San Francisco, 1998)

  11. W. Banzhaf, F.D. Francone, P. Nordin, Some emergent properties of variable size EAs. Position paper at the workshop on evolutionary computation with variable size representation at ICGA-97 (1997)

  12. L. Beadle, C.G. Johnson, Semantically driven crossover in genetic programming, in IEEE World Congress on Computational Intelligence (IEEE Press, Piscataway, 2008), pp. 111–116

  13. S. Bleuler, M. Brack, L. Thiele, E. Zitzler, Multiobjective genetic programming: reducing bloat using SPEA2, in Proceedings of CEC-2001 (IEEE Press, Piscataway, 2001), pp. 536–543

  14. T. Blickle, Theory of evolutionary algorithms and applications to system design. PhD thesis, Swiss Federal Institute of Technology, Computer Engineering and Networks Laboratory (1996)

  15. T. Blickle, Evolving compact solutions in genetic programming: a case study, in Proceedings of Parallel Problem Solving From Nature IV, ed. by H.-M. Voigt et al. (Springer, Berlin, 1996), pp. 564–573

  16. T. Blickle, L. Thiele, Genetic programming and redundancy, in Genetic Algorithms within the Framework of Evolutionary Computation, ed. by J. Hopf (Max-Planck-Institut für Informatik, Germany, 1994), pp. 33–38

  17. M. Brameier, W. Banzhaf, A comparison of linear genetic programming and neural networks in medical data mining. IEEE Trans. Evol. Comput. 5(1), 17–26 (2001)

  18. M. Brameier, W. Banzhaf, Neutral variations cause bloat in linear GP, in Proceedings of EuroGP-2003, ed. by C. Ryan et al. (Springer, Berlin, 2003), pp. 286–296

  19. J. Cuendet, Populations dynamiques en programmation génétique. MSc thesis, Université de Lausanne, Université de Genève (2004)

  20. L.E. Da Costa, J.A. Landry, Relaxed genetic programming, in Proceedings of GECCO-2006, ed. by M. Keijzer et al. (ACM Press, New York, 2006), pp. 937–938

  21. E.D. De Jong, J.B. Pollack, Multi-objective methods for tree size control. Genet. Program. Evolvable Mach. 4(3), 211–233 (2003)

  22. E.D. De Jong, R.A. Watson, J.B. Pollack, Reducing bloat and promoting diversity using multi-objective methods, in Proceedings of GECCO-2001, ed. by L. Spector et al. (Morgan Kaufmann, San Francisco, 2001), pp. 11–18

  23. P. D’haeseleer, Context preserving crossover in genetic programming, in Proceedings of the 1994 IEEE World Congress on Computational Intelligence (IEEE Press, Piscataway, 1994), pp. 256–261

  24. S. Dignum, R. Poli, Generalisation of the limiting distribution of program sizes in tree-based genetic programming and analysis of its effects on bloat, in Proceedings of GECCO-2007, ed. by D. Thierens et al. (ACM Press, New York, 2007), pp. 1588–1595

  25. S. Dignum, R. Poli, Operator equalisation and bloat free GP, in Proceedings of EuroGP-2008, ed. by M. O’Neill et al. (Springer, Berlin, 2008), pp. 110–121

  26. S. Dignum, R. Poli, Crossover, sampling, bloat and the harmful effects of size limits, in Proceedings of EuroGP-2008, ed. by M. O’Neill et al. (Springer, Berlin, 2008), pp. 158–169

  27. S. Dignum, R. Poli, Sub-tree swapping crossover and arity histogram distributions, in Proceedings of EuroGP-2010, ed. by A.I. Esparcia-Alcázar et al. (Springer, Berlin, 2010), pp. 38–49

  28. P. Domingos, The role of Occam’s razor in knowledge discovery. Data Min. Knowl. Discov. 3(4), 409–425 (1999)

  29. A. Ekart, Shorter fitness preserving genetic programs, in Proceedings of AE-1999, ed. by C. Fonlupt et al. (Springer, Berlin, 2000), pp. 73–83

  30. A. Ekart, S.Z. Németh, Selection based on the Pareto nondomination criterion for controlling code growth in genetic programming. Genet. Program. Evolvable Mach. 2(1), 61–73 (2001)

  31. F. Fernandez, L. Vanneschi, M. Tomassini, The effect of plagues in genetic programming: a study of variable-size populations, in Proceedings of EuroGP-2003, ed. by C. Ryan et al. (Springer, Berlin, 2003), pp. 317–326

  32. F. Fernandez, M. Tomassini, L. Vanneschi, Saving computational effort in genetic programming by means of plagues, in Proceedings of CEC-2003, ed. by R. Sarker et al. (IEEE Press, Piscataway, 2003), pp. 2042–2049

  33. A.A. Freitas, Data Mining and Knowledge Discovery with Evolutionary Algorithms (Springer, Berlin, 2002)

  34. C. Gathercole, P. Ross, An adverse interaction between crossover and restricted tree depth in genetic programming, in Proceedings of GP’96, ed. by J.R. Koza et al. (MIT Press, Cambridge, 1996), pp. 291–296

  35. T. Haynes, Collective adaptation: the exchange of coding segments. Evol. Comput. 6(4), 311–338 (1998)

  36. M.I. Heywood, A.N. Zincir-Heywood, Dynamic page-based crossover in linear genetic programming. IEEE Trans. Syst. Man Cybern. Part B Cybern. 32(3), 380–388 (2002)

  37. D. Hooper, N.S. Flann, Improving the accuracy and robustness of genetic programming through expression simplification, in Proceedings of GP’96, ed. by J.R. Koza et al. (MIT Press, Cambridge, 1996), p. 428

  38. K. Krawiec, Semantically embedded genetic programming: automated design of abstract program representations, in Proceedings of GECCO-2011, ed. by N. Krasnogor et al. (ACM Press, New York, 2011), pp. 1379–1386

  39. H. Iba, H. de Garis, T. Sato, Genetic programming using a minimum description length principle, in Advances in Genetic Programming, ed. by K.E. Kinnear Jr. (MIT Press, Cambridge, 1994), pp. 265–284

  40. H. Iba, M. Terao, Controlling effective introns for multi-agent learning by genetic programming, in Proceedings of GECCO-2000, ed. by D. Whitley et al. (Morgan Kaufmann, San Francisco, 2000), pp. 419–426

  41. C. Igel, K. Chellapilla, Investigating the influence of depth and degree of genotypic change on fitness in genetic programming, in Proceedings of GECCO-1999, ed. by W. Banzhaf et al. (Morgan Kaufmann, San Francisco, 1999), pp. 1061–1068

  42. K. Janardan, Weighted Lagrange distributions and their characterizations. SIAM J. Appl. Math. 47(2), 411–415 (1987)

  43. K. Janardan, B. Rao, Lagrange distributions of the second kind and weighted distributions. SIAM J. Appl. Math. 43(2), 302–313 (1983)

  44. C.J. Kennedy, C. Giraud-Carrier, A depth controlling strategy for strongly typed evolutionary programming, in Proceedings of GECCO-1999, ed. by W. Banzhaf et al. (Morgan Kaufmann, San Francisco, 1999), pp. 879–885

  45. K.E. Kinnear Jr., Generality and difficulty in genetic programming: evolving a sort, in Proceedings of ICGA’93, ed. by S. Forrest (Morgan Kaufmann, San Francisco, 1993), pp. 287–294

  46. D. Kinzett, M. Zhang, M. Johnston, Using numerical simplification to control bloat in genetic programming, in Proceedings of SEAL-2008 (Springer, 2008), pp. 493–502

  47. J.R. Koza, Genetic Programming—on the Programming of Computers by means of Natural Selection (MIT Press, Cambridge, 1992)

  48. J.R. Koza, Genetic Programming II—Automatic Discovery of Reusable Programs (MIT Press, Cambridge, 1994)

  49. J.R. Koza, F.H. Bennett III, D. Andre, M.A. Keane, Genetic Programming III—Darwinian Invention and Problem Solving (Morgan Kaufmann, San Francisco, 1999)

  50. W.B. Langdon, Genetic Programming + Data Structures = Automatic Programming! (Kluwer Academic Publishers, Boston, 1998)

  51. W.B. Langdon, The evolution of size in variable length representations, in Proceedings of the 1998 IEEE International Conference on Evolutionary Computation (IEEE Press, Piscataway, 1998), pp. 633–638

  52. W.B. Langdon, Genetic programming bloat with dynamic fitness, in Proceedings of EuroGP-1998, ed. by W. Banzhaf et al. (Springer, Berlin, 1998), pp. 96–112

  53. W.B. Langdon, Size fair and homologous tree genetic programming crossovers, in Proceedings of GECCO-1999, ed. by W. Banzhaf et al. (Morgan Kaufmann, San Francisco, 1999), pp. 1092–1097

  54. W.B. Langdon, Size fair and homologous tree genetic programming crossovers. Genet. Program. Evolvable Mach. 1(1/2), 95–119 (2000)

  55. W.B. Langdon, Quadratic bloat in genetic programming, in Proceedings of GECCO-2000, ed. by D. Whitley et al. (Morgan Kaufmann, San Francisco, 2000), pp. 451–458

  56. W.B. Langdon, J.P. Nordin, Seeding GP populations, in Proceedings of EuroGP-2000, ed. by R. Poli et al. (Springer, Berlin, 2000), pp. 304–315

  57. W.B. Langdon, R. Poli, Fitness causes bloat, in Proceedings of the Second On-line World Conference on Soft Computing in Engineering Design and Manufacturing, ed. by P.K. Chawdhry et al. (Springer, Berlin, 1997), pp. 13–22

  58. W.B. Langdon, R. Poli, An analysis of the MAX problem in genetic programming, in Proceedings of GP’97, ed. by J.R. Koza et al. (Morgan Kaufmann, San Francisco, 1997), pp. 222–230

  59. W.B. Langdon, R. Poli, Fitness causes bloat: mutation, in Proceedings of EuroGP’98, ed. by W. Banzhaf et al. (Springer, Berlin, 1998), pp. 37–48

  60. W.B. Langdon, R. Poli, Foundations of Genetic Programming (Springer, Berlin, 2002)

  61. W.B. Langdon, T. Soule, R. Poli, J.A. Foster, The evolution of size and shape, in Advances in Genetic Programming 3, ed. by L. Spector et al. (MIT Press, Cambridge, 1999), pp. 163–190

  62. W.B. Langdon, W. Banzhaf, Genetic programming bloat without semantics, in Proceedings of PPSN-2000, ed. by M. Schoenauer et al. (Springer, Berlin, 2000), pp. 201–210

  63. S. Luke, Code growth is not caused by introns, in Late Breaking Papers at GECCO-2000 (2000), pp. 228–235

  64. S. Luke, Issues in scaling genetic programming: breeding strategies, tree generation, and code bloat. PhD thesis, Department of Computer Science, University of Maryland (2000)

  65. S. Luke, G.C. Balan, L. Panait, Population implosion in genetic programming, in Proceedings of GECCO-2003, ed. by E. Cantú-Paz et al. (Springer, Berlin, 2003), pp. 1729–1739

  66. S. Luke, Modification point depth and genome growth in genetic programming. Evol. Comput. 11(1), 67–106 (2003)

  67. S. Luke, Evolutionary computation and the C-value paradox, in Proceedings of GECCO-2005, ed. by H.-G. Beyer et al. (ACM Press, New York, 2005), pp. 91–97

  68. S. Luke, L. Panait, Fighting bloat with nonparametric parsimony pressure, in Proceedings of PPSN-2002, ed. by J.M. Guervos et al. (Springer, Berlin, 2002), pp. 411–420

  69. S. Luke, L. Panait, Lexicographic parsimony pressure, in Proceedings of GECCO-2002, ed. by W.B. Langdon et al. (Morgan Kaufmann, San Francisco, 2002), pp. 829–836

  70. S. Luke, L. Panait, A comparison of bloat control methods for genetic programming. Evol. Comput. 14(3), 309–344 (2006)

  71. P. Martin, R. Poli, Crossover operators for a hardware implementation of genetic programming using FPGAs and Handel-C, in Proceedings of GECCO-2002, ed. by W.B. Langdon et al. (Morgan Kaufmann, San Francisco, 2002), pp. 845–852

  72. N.F. McPhee, J.D. Miller, Accurate replication in genetic programming, in Proceedings of ICGA’95, ed. by L. Eshelman (Morgan Kaufmann, San Francisco, 1995), pp. 303–309

  73. N.F. McPhee, A. Jarvis, E.F. Crane, On the strength of size limits in linear genetic programming, in Proceedings of GECCO-2004, ed. by K. Deb et al. (Springer, Berlin, 2004), pp. 593–604

  74. N.F. McPhee, B. Ohs, T. Hutchison, Semantic building blocks in genetic programming, in Proceedings of EuroGP-2008, ed. by M. O’Neill et al. (Springer, Berlin, 2008), pp. 134–145

  75. N.F. McPhee, R. Poli, A schema theory analysis of the evolution of size in genetic programming with linear representations, in Proceedings of EuroGP-2001, ed. by J. Miller et al. (Springer, Berlin, 2001), pp. 108–125

  76. J. Miller, What bloat? Cartesian genetic programming on Boolean problems, in Late Breaking Papers at GECCO-2001 (2001), pp. 295–302

  77. M. Naoki, B. McKay, N. Xuan, E. Daryl, S. Takeuchi, A new method for simplifying algebraic expressions in genetic programming called equivalent decision simplification, in Proceedings of the 10th International Work-Conference on Artificial Neural Networks (Springer, Berlin, 2009), pp. 171–178

  78. P. Nordin, W. Banzhaf, Complexity compression and evolution, in Proceedings of ICGA’95, ed. by L. Eshelman (Morgan Kaufmann, San Francisco, 1995), pp. 318–325

  79. P. Nordin, W. Banzhaf, F.D. Francone, Efficient evolution of machine code for CISC architectures using instruction blocks and homologous crossover, in Advances in Genetic Programming 3, ed. by L. Spector et al. (MIT Press, Cambridge, 1999), pp. 275–299

  80. P. Nordin, F. Francone, W. Banzhaf, Explicitly defined introns and destructive crossover in genetic programming, in Advances in Genetic Programming 2, ed. by P.J. Angeline, K.E. Kinnear Jr. (MIT Press, Cambridge, 1996), pp. 111–134

  81. J. Page, R. Poli, W.B. Langdon, Smooth uniform crossover with smooth point mutation in genetic programming: a preliminary study, in Proceedings of EuroGP-1999, ed. by R. Poli et al. (Springer, Berlin, 1999), pp. 39–49

  82. L. Panait, S. Luke, Alternative bloat control methods, in Proceedings of GECCO-2004, ed. by K. Deb et al. (Springer, Berlin, 2004), pp. 630–641

  83. M.D. Platel, M. Clergue, P. Collard, Maximum homologous crossover for linear genetic programming, in Proceedings of EuroGP-2003, ed. by C. Ryan et al. (Springer, Berlin, 2003), pp. 194–203

  84. R. Poli, General schema theory for genetic programming with subtree-swapping crossover, in Proceedings of EuroGP-2001, ed. by J. Miller et al. (Springer, Berlin, 2001), pp. 143–159

  85. R. Poli, A simple but theoretically-motivated method to control bloat in genetic programming, in Proceedings of EuroGP-2003, ed. by C. Ryan et al. (Springer, Berlin, 2003), pp. 200–210

  86. R. Poli, W.B. Langdon, Genetic programming with one-point crossover, in Proceedings of the Second On-Line World Conference on Soft Computing in Engineering Design and Manufacturing, ed. by P.K. Chawdhry et al. (Springer, Berlin, 1997), pp. 180–189

  87. R. Poli, W.B. Langdon, A new schema theory for genetic programming with one-point crossover and point mutation, in Proceedings of GP’97, ed. by J. Koza et al. (Morgan Kaufmann, San Francisco, 1997), pp. 278–285

  88. R. Poli, W.B. Langdon, On the search properties of different crossover operators in genetic programming, in Proceedings of GP’98, ed. by J. Koza et al. (Morgan Kaufmann, San Francisco, 1998), pp. 293–301

  89. R. Poli, W.B. Langdon, S. Dignum, On the limiting distribution of program sizes in tree-based genetic programming, in Proceedings of EuroGP-2007, ed. by M. Ebner et al. (Springer, Berlin, 2007), pp. 193–204

  90. R. Poli, W.B. Langdon, N.F. McPhee, A Field Guide to Genetic Programming (2008), http://lulu.com, http://www.gp-field-guide.org.uk (With contributions by J.R. Koza)

  91. R. Poli, N.F. McPhee, Parsimony pressure made easy, in Proceedings of GECCO-2008, ed. by M. Keijzer et al. (ACM Press, New York, 2008), pp. 1267–1274

  92. R. Poli, N.F. McPhee, L. Vanneschi, The impact of population size on code growth in GP: analysis and empirical validation, in Proceedings of GECCO-2008, ed. by M. Keijzer et al. (ACM Press, New York, 2008), pp. 1275–1282

  93. R. Poli, N.F. McPhee, L. Vanneschi, Elitism reduces bloat in genetic programming, in Proceedings of GECCO-2008, ed. by M. Keijzer et al. (ACM Press, New York, 2008), pp. 1343–1344

  94. R. Poli, N.F. McPhee, L. Vanneschi, Analysis of the effects of elitism on bloat in linear and tree-based genetic programming, in Genetic Programming Theory and Practice VI, ed. by R. Riolo et al. (Springer, Berlin, 2008), pp. 91–111

  95. A. Ratle, M. Sebag, Avoiding the bloat with probabilistic grammar-guided genetic programming, in Proceedings of the Artificial Evolution 5th International Conference, ed. by P. Collet et al. (Springer, Berlin, 2001), pp. 255–266

  96. J. Rissanen, Modeling by shortest data description. Automatica 14, 465–471 (1978)

  97. D. Rochat, Programmation génétique parallèle: opérateurs génétiques variés et populations dynamiques. MSc thesis, Université de Lausanne, Université de Genève (2004)

  98. D. Rochat, M. Tomassini, L. Vanneschi, Dynamic size populations in distributed genetic programming, in Proceedings of EuroGP-2005, ed. by M. Keijzer et al. (Springer, Berlin, 2005), pp. 50–61

  99. J.P. Rosca, Generality versus size in genetic programming, in Proceedings of GP’96, ed. by J.R. Koza et al. (MIT Press, Cambridge, 1996), pp. 381–387

  100. J.P. Rosca, Analysis of complexity drift in genetic programming, in Proceedings of GP’97, ed. by J.R. Koza et al. (Morgan Kaufmann, San Francisco, 1997), pp. 286–294

  101. J.P. Rosca, D.H. Ballard, Complexity Drift in Evolutionary Computation with Tree Representations. Technical Report NRL96.5, Computer Science Department, The University of Rochester (1996)

  102. J.P. Rosca, D.H. Ballard, Discovery of subroutines in genetic programming, in Advances in Genetic Programming 2, ed. by P.J. Angeline, K.E. Kinnear Jr. (MIT Press, Cambridge, 1996), pp. 177–202

  103. C. Ryan, Pygmies and civil servants, in Advances in Genetic Programming, ed. by K.E. Kinnear Jr. (MIT Press, Cambridge, 1994), pp. 243–263

  104. S. Silva, Controlling bloat: individual and population based approaches in genetic programming. PhD thesis, Departamento de Engenharia Informatica, Universidade de Coimbra (2008)

  105. S. Silva, J. Almeida, Dynamic maximum tree depth—a simple technique for avoiding bloat in tree-based GP, in Proceedings of GECCO-2003, ed. by E. Cantú-Paz et al. (Springer, Berlin, 2003), pp. 1776–1787

  106. S. Silva, E. Costa, Dynamic limits for bloat control—variations on size and depth, in Proceedings of GECCO-2004, ed. by K. Deb et al. (Springer, Berlin, 2004), pp. 666–677

  107. S. Silva, E. Costa, Resource-limited genetic programming: the dynamic approach, in Proceedings of GECCO-2005, ed. by H.-G. Beyer et al. (ACM Press, New York, 2005), pp. 1673–1680

  108. S. Silva, E. Costa, Comparing tree depth-limits and resource-limited GP, in Proceedings of CEC-2005, ed. by D. Corne et al. (IEEE Press, Pittsburgh, 2005), pp. 920–927

  109. S. Silva, E. Costa, Dynamic limits for bloat control in genetic programming and a review of past and current bloat theories. Genet. Program. Evolvable Mach. 10(2), 141–179 (2009)

  110. S. Silva, S. Dignum, Extending operator equalisation: fitness based self adaptive length distribution for bloat free GP, in Proceedings of EuroGP-2009, ed. by L. Vanneschi et al. (Springer, Berlin, 2009), pp. 159–170

  111. S. Silva, P.J.N. Silva, E. Costa, Resource-limited genetic programming: replacing tree depth limits, in Proceedings of ICANNGA-2005, ed. by B. Ribeiro et al. (Springer, Berlin, 2005), pp. 243–246

  112. S. Silva, L. Vanneschi, Operator equalisation, bloat and overfitting—a study on human oral bioavailability prediction, in Proceedings of GECCO-2009, ed. by F. Rothlauf et al. (ACM Press, New York, 2009), pp. 1115–1122

  113. S. Silva, L. Vanneschi, State-of-the-art genetic programming for predicting human oral bioavailability of drugs, in Proceedings of the 4th International Workshop on Practical Applications of Computational Biology & Bioinformatics (IWPACBB-2010), ed. by M.P. Rocha et al. (Springer, Berlin, 2010), pp. 165–173

  114. S. Silva, M.J. Vasconcelos, J.B. Melo, Bloat free genetic programming versus classification trees for identification of burned areas in satellite imagery, in Proceedings of EvoApplications 2010, Evolutionary Computation in Image Analysis and Signal Processing (EvoIASP-2010), ed. by C. Di Chio et al. (Springer, Berlin, 2010), pp. 272–281

  115. S.F. Smith, A learning system based on genetic adaptive algorithms. PhD thesis, University of Pittsburgh, Pittsburgh, PA, USA (1980). AAI8112638

  116. P.W.H. Smith, K. Harries, Code growth, explicitly defined introns, and alternative selection schemes. Evol. Comput. 6(4), 339–360 (1998)

  117. T. Soule, J.A. Foster, Removal bias: a new cause of code growth in tree based evolutionary programming, in Proceedings of the 1998 IEEE International Conference on Evolutionary Computation (IEEE Press, Piscataway, 1998), pp. 781–786

  118. T. Soule, Code growth in genetic programming. PhD thesis, College of Graduate Studies, University of Idaho (1998)

  119. T. Soule, J. Foster, Code size and depth flows in genetic programming, in Proceedings of GP’97, ed. by J. Koza et al. (Morgan Kaufmann, San Francisco, 1997), pp. 313–320

  120. T. Soule, J.A. Foster, Effects of code growth and parsimony pressure on populations in genetic programming. Evol. Comput. 6(4), 293–309 (1998)

  121. T. Soule, J. Foster, J. Dickinson, Code growth in genetic programming, in Proceedings of GP’96, ed. by J. Koza et al. (MIT Press, Cambridge, 1996), pp. 215–223

  122. T. Soule, R.B. Heckendorn, An analysis of the causes of code growth in genetic programming. Genet. Program. Evolvable Mach. 3(1), 283–309 (2002)

  123. L. Spector, Simultaneous evolution of programs and their control structures, in Advances in Genetic Programming 2, ed. by P.J. Angeline, K.E. Kinnear Jr. (MIT Press, Cambridge, 1996), pp. 137–154

  124. J. Stevens, R.B. Heckendorn, T. Soule, Exploiting disruption aversion to control code bloat, in Proceedings of GECCO-2005, ed. by H.-G. Beyer et al. (ACM Press, New York, 2005), pp. 1605–1612

  125. W.A. Tackett, Recombination, selection, and the genetic construction of genetic programs. PhD thesis, Department of Electrical Engineering Systems, University of Southern California (1994)

  126. M. Tomassini, L. Vanneschi, J. Cuendet, F. Fernandez, A new technique for dynamic size populations in genetic programming, in Proceedings of CEC-2004 (IEEE Press, Piscataway, 2004), pp. 486–493

  127. T. Van Belle, D.H. Ackley, Uniform subtree mutation, in Proceedings of EuroGP-2002, ed. by J.A. Foster et al. (Springer, Berlin, 2002), pp. 152–161

  128. L. Vanneschi, Theory and practice for efficient genetic programming. PhD thesis, Faculty of Sciences, University of Lausanne (2004)

  129. L. Vanneschi, M. Castelli, S. Silva, Measuring bloat, overfitting and functional complexity in genetic programming, in Proceedings of GECCO-2010, ed. by J. Branke et al. (ACM Press, New York, 2010), pp. 877–884

  130. L. Vanneschi, S. Silva, Using operator equalisation for prediction of drug toxicity with genetic programming, in Proceedings of EPIA-2009, ed. by L.S. Lopes et al. (Springer, Berlin, 2009), pp. 65–76

  131. L. Vanneschi, M. Tomassini, P. Collard, M. Clergue, Fitness distance correlation in structural mutation genetic programming, in Proceedings of EuroGP-2003, ed. by C. Ryan et al. (Springer, Berlin, 2003), pp. 455–464

  132. N. Wagner, Z. Michalewicz, Genetic programming with efficient population control for financial time series prediction, in Late Breaking Papers at GECCO-2001 (2001), pp. 458–462

  133. B.-T. Zhang, H. Mühlenbein, Balancing accuracy and parsimony in genetic programming. Evol. Comput. 3(1), 17–38 (1995)

  134. B.-T. Zhang, A taxonomy of control schemes for genetic code growth. Position paper at the workshop on evolutionary computation with variable size representation at ICGA-97 (1997)

  135. B.-T. Zhang, Bayesian methods for efficient genetic programming. Genet. Program. Evolvable Mach. 1(1), 217–242 (2000)

Acknowledgments

This work was partially supported by FCT (INESC-ID multiannual funding) through the PIDDAC Program funds. The authors acknowledge project PTDC/EIA-CCO/103363/2008 from Fundação para a Ciência e a Tecnologia, Portugal. Acknowledgments also go to Sean Luke for suggesting the Minimum Child Size experiments described in Sect. 4 and to Riccardo Poli for permission to use the Lagrange distribution graphs presented in the same section. Thanks also to the anonymous reviewers of [109] for providing some of the references of Sect. 3, and to the reviewers of the current work for providing so many helpful suggestions.

Corresponding author

Correspondence to Sara Silva.

About this article

Cite this article

Silva, S., Dignum, S. & Vanneschi, L. Operator equalisation for bloat free genetic programming and a survey of bloat control methods. Genet Program Evolvable Mach 13, 197–238 (2012). https://doi.org/10.1007/s10710-011-9150-5

  • DOI: https://doi.org/10.1007/s10710-011-9150-5
