DOI: 10.1145/2598394.2609875
technical-note

Benchmarks that matter for genetic programming

Published: 12 July 2014

ABSTRACT

Several papers have been published on the practice of benchmarking in machine learning, and in Genetic Programming (GP) in particular. In addition, GP has been accused of targeting over-simplified 'toy' problems that do not reflect the complexity of the real-world applications for which GP is ultimately intended. There are also theoretical results that relate the performance of an algorithm to a probability distribution over problem instances, and so the current debate concerning benchmarks spans from the theoretical to the empirical.

The aim of this article is to consolidate an emerging theme arising from these papers and to suggest that benchmarks should not be arbitrarily selected but should instead be drawn from an underlying probability distribution that reflects the problem instances to which the algorithm is likely to be applied in the real world. These probability distributions are effectively dictated by the application domains themselves (essentially data-driven) and should thus re-engage the owners of the originating data.
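As a rough illustration of drawing benchmarks from a distribution rather than hand-picking them, the sketch below estimates the expected cost of two heuristics over sampled problem instances. Everything in it is an assumption made for illustration and is not taken from the article: the bin-packing domain, the beta(2, 5) item-size distribution, and the names `sample_instance`, `first_fit`, `first_fit_decreasing` and `expected_cost`.

```python
import random
from statistics import mean
from typing import Callable, List

Instance = List[float]  # hypothetical instance: item sizes for bin packing


def sample_instance(rng: random.Random, n_items: int = 20) -> Instance:
    """Draw one instance from an assumed application distribution.

    The beta(2, 5) shape (mostly small items) is a placeholder for whatever
    distribution the real application data would dictate.
    """
    return [rng.betavariate(2, 5) for _ in range(n_items)]


def first_fit(items: Instance, capacity: float = 1.0) -> int:
    """Pack items in the given order with first-fit; return bins used."""
    free_space: List[float] = []  # remaining capacity of each open bin
    for size in items:
        for i, free in enumerate(free_space):
            if size <= free:
                free_space[i] = free - size
                break
        else:
            # No open bin can hold the item, so open a new one.
            free_space.append(capacity - size)
    return len(free_space)


def first_fit_decreasing(items: Instance, capacity: float = 1.0) -> int:
    """Sort items largest-first, then apply first-fit."""
    return first_fit(sorted(items, reverse=True), capacity)


def expected_cost(
    algorithm: Callable[[Instance], int],
    sampler: Callable[[random.Random], Instance],
    n_samples: int = 200,
    seed: int = 0,
) -> float:
    """Monte Carlo estimate of mean cost over instances drawn from `sampler`."""
    rng = random.Random(seed)  # fixed seed: both algorithms see the same draws
    return mean(algorithm(sampler(rng)) for _ in range(n_samples))


for algo in (first_fit, first_fit_decreasing):
    print(algo.__name__, expected_cost(algo, sample_instance))
```

Swapping `sample_instance` for a sampler fitted to a different application's data changes the estimate, and possibly the ranking of the two heuristics, which is the sense in which a benchmark result is only meaningful relative to a stated distribution.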

A consequence of properly founded benchmarking is the suggestion of meta-learning as a methodology for designing algorithms automatically rather than manually. A secondary motive is to reduce the number of research papers that propose new algorithms but do not state in advance what their purpose is (i.e. in what context they should be applied). To put the current practice of GP benchmarking in a particularly harsh light, one might ask what the performance of an algorithm on Koza's lawnmower problem (a favourite toy problem of the GP community) has to say about its performance on a very real-world cancer data set: the two are completely unrelated.

    • Published in

      GECCO Comp '14: Proceedings of the Companion Publication of the 2014 Annual Conference on Genetic and Evolutionary Computation
      July 2014
      1524 pages
      ISBN:9781450328814
      DOI:10.1145/2598394

      Copyright © 2014 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      GECCO Comp '14 paper acceptance rate: 180 of 544 submissions, 33%
      Overall acceptance rate: 1,669 of 4,410 submissions, 38%
