A roadmap for solving optimization problems with estimation of distribution algorithms

Natural Computing

Abstract

In recent decades, Estimation of Distribution Algorithms (EDAs) have gained considerable popularity in the evolutionary computation community for solving optimization problems. Characterized by the use of probabilistic models to represent the solutions and the interactions between the variables of the problem, EDAs can be applied to discrete, continuous, or mixed-domain problems. Owing to this robustness, these algorithms have been used to solve a diverse set of real-world and academic optimization problems. However, a straightforward application is possible in only a few cases; in the general case, an efficient application requires intuition about the problem as well as a solid understanding of probabilistic modeling. In this paper, we provide a roadmap for solving optimization problems via EDAs. The aim of the paper is not to provide a thorough review of EDAs, but to present a guide for practitioners interested in exploiting the potential of EDAs when solving optimization problems. To make the roadmap as useful as possible, we address the key aspects involved in the design and application of EDAs in a sequence of stages: (1) the choice of the codification, (2) the choice of the probability model, (3) strategies to incorporate knowledge about the problem into the model, and (4) balancing the diversification-intensification behavior of the EDA. At each stage, the contents are first presented together with common practices and advice to follow. Then, an example illustrates the different alternatives. In addition to the roadmap, the paper presents current open challenges in developing EDAs and reviews paths for future research in the context of EDAs.
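As a rough illustration of the loop that the stages above refer to (sample from a probability model, select, re-estimate the model), the following is a minimal sketch of a univariate EDA on binary strings. It is not the method of the paper: the objective (counting ones), the function name `umda_onemax`, the probability margin, and all parameter values are assumptions chosen only to keep the example short.

```python
import random

def umda_onemax(n_bits=20, pop_size=100, n_select=30, generations=50, seed=0):
    """Univariate EDA sketch: maximize the number of ones in a bit string."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits          # univariate marginal model p(x_i = 1)
    margin = 1.0 / n_bits           # assumed bound keeping p away from 0 and 1
    best = None
    for _ in range(generations):
        # Sample a population from the current probabilistic model.
        population = [[1 if rng.random() < p else 0 for p in probs]
                      for _ in range(pop_size)]
        # Truncation selection: keep the n_select fittest individuals.
        population.sort(key=sum, reverse=True)
        selected = population[:n_select]
        if best is None or sum(selected[0]) > sum(best):
            best = selected[0]
        # Re-estimate each marginal from the selected individuals.
        for i in range(n_bits):
            freq = sum(ind[i] for ind in selected) / n_select
            probs[i] = min(1.0 - margin, max(margin, freq))
    return best, probs

best, probs = umda_onemax()
print(sum(best), "ones in the best sampled individual")
```

In this sketch the codification is a plain bit string, the model is a product of independent marginals, and the margin is one simple way to retain some diversification; richer models and problem knowledge would replace these choices in practice.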

Notes

  1. The term search space has been used in many optimization papers with very different meanings. Because this paper considers several different spaces, we avoid the term altogether to prevent confusion arising from readers' prior understanding of it.

  2. In this TSP variant, the cost of travelling between cities i and j is the same in either direction.

  3. A permutation is understood as a bijection \(\sigma\) of the set of natural numbers \(\{1,\ldots , n\}\) onto itself.

  4. We make no assumptions about the sampling mechanism, and focus exclusively on the fact that two different individuals representing the same solution are assigned different probabilities.

  5. A summarized introduction to Bayesian statistics can be found in Calvo et al. (2018).
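Notes 2 to 4 can be made concrete with a small sketch. It checks that a sequence is a permutation in the sense of note 3, evaluates a symmetric TSP tour as in note 2, and shows two individuals that encode the same tour receiving different probabilities, as in note 4. The distance matrix, the item weights, and the choice of a Plackett-Luce model are illustrative assumptions, not values or choices taken from the paper.

```python
def is_permutation(sigma):
    """True if sigma is a bijection of {1, ..., n} onto itself (one-line notation)."""
    return sorted(sigma) == list(range(1, len(sigma) + 1))

def tour_cost(sigma, dist):
    """Cost of the closed tour visiting cities in the order given by sigma."""
    n = len(sigma)
    return sum(dist[sigma[k] - 1][sigma[(k + 1) % n] - 1] for k in range(n))

def plackett_luce_prob(sigma, weights):
    """Probability of sigma under a Plackett-Luce model with positive item weights."""
    prob = 1.0
    remaining = sum(weights[i - 1] for i in sigma)
    for item in sigma:
        prob *= weights[item - 1] / remaining
        remaining -= weights[item - 1]
    return prob

# Hypothetical symmetric distance matrix for 4 cities: dist[i][j] == dist[j][i].
dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 3],
        [10, 4, 3, 0]]
weights = [4.0, 3.0, 2.0, 1.0]   # hypothetical model parameters

a = [1, 2, 3, 4]
b = [2, 3, 4, 1]   # the same cyclic tour as `a`, started from city 2

print(tour_cost(a, dist), tour_cost(b, dist))                        # equal costs
print(plackett_luce_prob(a, weights), plackett_luce_prob(b, weights))  # unequal probabilities
```

Both individuals describe one and the same tour, yet the model treats them as distinct outcomes, which is the kind of redundancy the choice of codification has to account for.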

References

  • Alden M, Miikkulainen R (2016) MARLEDA: effective distribution estimation through Markov random fields. Theoret Comput Sci 633:4–18

  • Alza J, Ceberio J, Calvo B (2018) Balancing the diversification-intensification trade-off using mixtures of probability models. In: 2018 IEEE congress on evolutionary computation (CEC), pp 1–8

  • Armañanzas R, Inza I, Santana R, Saeys Y, Flores J, Lozano J, Van de Peer Y, Blanco R, Robles V, Bielza C, Larrañaga P (2008) A review of estimation of distribution algorithms in bioinformatics. BioData Min 1(1):6

  • Arza E, Perez A, Irurozki E, Ceberio J (2020) Kernels of Mallows models under the Hamming distance for solving the quadratic assignment problem. Swarm Evol Comput 59:100740

  • Ayodele M, McCall J, Regnier-Coudert O, Bowie L (2017) A random key based estimation of distribution algorithm for the permutation flowshop scheduling problem. In: 2017 IEEE congress on evolutionary computation (CEC), pp 2364–2371

  • Baluja S (2006) Incorporating a priori knowledge in probabilistic-model based optimization. In: Scalable optimization via probabilistic modeling. Studies in computational intelligence, vol 33. Springer, Berlin, pp 205–222

  • Bengio Y, Lodi A, Prouvost A (2018) Machine learning for combinatorial optimization: a methodological tour d’horizon. Eur J Oper Res 290:405–421

  • Bosman PAN, Grahl J (2008) Matching inductive search bias and problem structure in continuous estimation-of-distribution algorithms. Eur J Oper Res 185(3):1246–1264

  • Bosman PAN, Thierens D (2000) Mixed IDEAs. Technical report, Utrecht University

  • Bosman PAN, Thierens D (2002) Multi-objective optimization with diversity preserving mixture-based iterated density estimation evolutionary algorithms. Int J Approx Reason 31(3):259–289

  • Brochu E, Cora VM, De Freitas N (2010) A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. arXiv:1012.2599

  • Brownlee A, Pelikan M, McCall J, Petrovski A (2008) An application of a multivariate estimation of distribution algorithm to cancer chemotherapy. In: Proceedings of the 2008 ACM genetic and evolutionary computation conference, pp 463–464

  • Calvo B, Ceberio J, Lozano JA (2018) Bayesian inference for algorithm ranking analysis. In: Proceedings of the genetic and evolutionary computation conference companion, GECCO ’18, pp 324–325, New York, NY, USA, ACM

  • Cappart Q, Chételat D, Khalil E, Lodi A, Morris C, Veličković P (2021) Combinatorial optimization and reasoning with graph neural networks

  • Carnero M, Hernández J, Sánchez M (2018) Optimal sensor location in chemical plants using the estimation of distribution algorithms. Ind Eng Chem Res 57(36):12149–12164

  • Ceberio J, Irurozki E, Mendiburu A, Lozano JA (2012) A review on estimation of distribution algorithms in permutation-based combinatorial optimization problems. Progress Artif Intell 1(1):103–117

  • Ceberio J, Mendiburu A, Lozano JA (2013) The Plackett-Luce ranking model on permutation-based optimization problems. In: 2013 IEEE congress on evolutionary computation, pp 494–501

  • Ceberio J, Mendiburu A, Lozano JA (2017) A square lattice probability model for optimising the graph partitioning problem. In: 2017 IEEE congress on evolutionary computation (CEC), pp 1629–1636. IEEE

  • Ceberio J, Mendiburu A, Lozano JA (2018) Distance-based exponential probability models on constrained combinatorial optimization problems. In: 2018 genetic and evolutionary computation conference (GECCO-2018), Kyoto, Japan, pp 137–138. ACM

  • Chen X, Tian Y (2019) Learning to perform local rewriting for combinatorial optimization. In: Advances in neural information processing systems (NeurIPS 2019), vol 32. ISBN: 9781713807933.

  • Crispino M, Antoniano-Villalobos I (2019) Informative extended Mallows priors in the Bayesian Mallows model. arXiv:1901.10870

  • Critchlow JVD, Fligner M (1991) Probability models on ranking. J Math Psychol 35:294–318

  • Dai H, Khalil EB, Zhang Y, Dilkina B, Song B (2017) Learning combinatorial optimization algorithms over graphs. In: Advances in neural information processing systems, vol 2017-Decem, pp 6349–6359

  • De Bonet JS, Isbell CL Jr, Viola P (1996) MIMIC: finding optima by estimating probability densities. In: Proceedings of the 9th international conference on neural information processing systems, NIPS’96. MIT Press, Cambridge, MA, USA, pp 424–430

  • Doignon J-P, Pekeč A, Regenwetter M (2004) The repeated insertion model for rankings: Missing link between two subset choice models. Psychometrika 69(1):33–54

  • Echegoyen C, Lozano J, Santana R, Larrañaga P (2007) Exact Bayesian network learning in estimation of distribution algorithms. pp 1051–1058

  • Echegoyen C, Mendiburu A, Santana R, Lozano JA (2012) Toward understanding EDAs based on Bayesian networks through a quantitative analysis. IEEE Trans Evol Comput 16(2):173–189

  • Echegoyen C, Mendiburu A, Santana R, Lozano JA (2013) On the taxonomy of optimization problems under estimation of distribution algorithms. Evol Comput 21(3):471–495

  • Etxeberria R, Larrañaga P (1999) Global optimization with Bayesian networks. In: II symposium on artificial intelligence, special session on distributions and evolutionary optimization, CIMAF99, pp 332–339

  • Fard MR, Mohaymany AS (2019) A copula-based estimation of distribution algorithm for calibration of microscopic traffic models. Transp Res Part C Emerg Technol 98:449–470

  • Fligner MA, Verducci JS (1986) Distance based ranking models. J R Stat Soc 48(3):359–369

  • Gallagher M (2000) Multi-layer perceptron error surfaces: visualization, structure and modelling models for iterative global optimization. PhD thesis, Queensland University

  • Glover FLM (1998) Handbook of combinatorial optimization, chapter Tabu search. Springer, Berlin

  • Goff L, Buchanan E, Hart E, Eiben A, Li W, de Carlo M, Hale M, Angus M, Woolley R, Timmis J, Winfield A, Tyrrell A (2020) Sample and time efficient policy learning with CMA-ES and Bayesian optimisation. pp 432–440

  • Goldberg DE (1989) Genetic algorithms in search, optimization, and machine learning. Addison-Wesley, Reading, MA

  • Goldberg DE, Lingle R (1985) Alleles, loci and the traveling salesman problem. In: ICGA, pp 154–159

  • Hauschild M, Pelikan M, Sastry K, Goldberg D (2011) Using previous models to bias structural learning in the hierarchical BOA. Evol Comput 20:135–160

  • Höns R (2012) Using maximum entropy and generalized belief propagation in estimation of distribution algorithms. In: Shakya S, Santana R (eds) Markov networks in evolutionary computation. Springer, pp 175–190

  • Irurozki E (2014) Sampling and learning distance-based probability models for permutation spaces. PhD thesis, University of the Basque Country

  • Irurozki E, Ceberio J, Santamaria J, Santana R, Mendiburu A (2018) Algorithm 989: Perm_mateda: a MATLAB toolbox of estimation of distribution algorithms for permutation-based combinatorial optimization problems. ACM Trans Math Softw 44(4):47:1–47:13

  • Iyer PVK (1950) The theory of probability distributions of points on a lattice. Ann Math Stat 21(2):198–217

  • Jiang S, Ziver A, Carter J, Pain C, Goddard A, Franklin S, Phillips H (2006) Estimation of distribution algorithms for nuclear reactor fuel management optimisation. Ann Nucl Energy 33(11–12):1039–1057

  • Joshi CK, Cappart Q, Rousseau L-M, Laurent T, Bresson X (2020) Learning TSP requires rethinking generalization. pp 1–22

  • Kirkpatrick S, Gelatt CD Jr, Vecchi MP (1983) Optimization by simulated annealing. Science 220(4598):671–680

  • Kollat JB, Reed PM, Kasprzyk JR (2008) A new epsilon-dominance hierarchical Bayesian optimization algorithm for large multi-objective monitoring network design problems. Adv Water Resour 31(5):828–845

  • Krejca M, Witt C (2018) Theory of estimation-of-distribution algorithms. CoRR, abs/1806.05392

  • Lan G, Tomczak J, Roijers D, Eiben A (2020) Time efficiency in optimization with a Bayesian-evolutionary algorithm

  • Lange K (2010) Numerical analysis for statisticians. Springer, Berlin

  • Larrañaga P, Lozano JA (2002) Estimation of distribution algorithms: a new tool for evolutionary computation. Kluwer Academic Publishers, New York

  • Lebanon G, Mao Y (2008) Non-parametric modeling of partially ranked data. J Mach Learn Res (JMLR) 9:2401–2429

  • Lima CF, Pelikan M, Lobo FG, Goldberg DE (2009) Loopy substructural local search for the Bayesian optimization algorithm. In: Engineering stochastic local search algorithms. Designing, implementing and analyzing effective heuristics. Springer, Berlin Heidelberg, pp 61–75

  • Lozano JA, Larrañaga P, Inza I, Bengoetxea E (2006) Towards a new evolutionary computation: advances on estimation of distribution algorithms (studies in fuzziness and soft computing). Springer, New York

  • Lozano JA, Mendiburu A (2002) Solving job scheduling with estimation of distribution algorithms. In: Larrañaga P, Lozano JA (eds) Estimation of distribution algorithms. A new tool for evolutionary computation, pp 231–242. Kluwer Academic Publishers, New York

  • Malagon M, Irurozki E, Ceberio J (2020) Alternative representations for codifying solutions in permutation-based problems. In: 2020 IEEE congress on evolutionary computation (CEC), pp 1–8

  • Marden JI (1996) Analyzing and modeling rank data. CRC Press

  • Mendiburu A, Santana R, Lozano JA (2012) Fast fitness improvements in estimation of distribution algorithms using belief propagation. In: Santana R, Shakya S (eds) Markov networks in evolutionary computation. Springer, Berlin, pp 141–155

  • Mezuman E, Weiss Y (2012) Globally optimizing graph partitioning problems using message passing. In: Proceedings of the 15th international conference on artificial intelligence and statistics (AISTATS), pp 770–778

  • Mühlenbein H (1998) The equation for response to selection and its use for prediction. Evol Comput 5:303–346

  • Mühlenbein H, Mahnig T (2002) Evolutionary optimization and the estimation of search distributions with applications to graph bipartitioning. Int J Approx Reason 31(3):157–192

  • Mühlenbein H, Mahnig T (1999) FDA—a scalable evolutionary algorithm for the optimization of additively decomposed functions. Evol Comput 7(4):353–376

  • Mühlenbein H, Paaß G (1996) From recombination of genes to the estimation of distributions I. Binary parameters. In: Lecture notes in computer science 1411: parallel problem solving from nature—PPSN IV, pp 178–187

  • Murphy TB, Martin D (2003) Mixtures of distance-based models for ranking data. Comput Stat Data Anal 41(3–4):645–655

  • Nogueira BGS, Sechidis K (2017) On the use of Spearman’s rho to measure the stability of feature rankings. In: Alexandre RJL, Salvador Sánchez J (eds) Pattern recognition and image analysis. IbPRIA 2017, vol 10255. Springer

  • Pelikan M, Goldberg DE, Lobo FG (2002) A survey of optimization by building and using probabilistic models. Comput Optim Appl 21(1):5–20

  • Pelikan M, Sastry K, Cantú-Paz E (2006) Scalable optimization via probabilistic modeling: from algorithms to applications (studies in computational intelligence). Springer, New York

  • Peña JM, Lozano JA, Larrañaga P (2005) Globally multimodal problem optimization via an estimation of distribution algorithm based on unsupervised learning of Bayesian networks. Evol Comput, pp 43–66

  • Regnier-Coudert O, McCall J (2014) Factoradic representation for permutation optimisation. In: Bartz-Beielstein T, Branke J, Filipič B, Smith J (eds) Parallel problem solving from nature—PPSN XIII. Springer, pp 332–341

  • Roman I, Mendiburu A, Santana R, Lozano JA (2020) Bayesian optimization approaches for massively multi-modal problems. In: Matsatsinis NF, Marinakis Y, Pardalos P (eds) Learning and intelligent optimization. Springer, Berlin, pp 383–397

  • Santana R, Bielza C, Larrañaga P, Lozano JA, Echegoyen C, Mendiburu A, Armañanzas R, Shakya S (2010) Mateda-2.0: estimation of distribution algorithms in MATLAB. J Stat Softw 35(7):1–30

  • Santana R, Larrañaga P, Lozano JA (2007) Challenges and open problems in discrete EDAs. Technical report, Department of Computer Science and Artificial Intelligence, University of the Basque Country

  • Santana R, Larrañaga P, Lozano JA (2008) Protein folding in simplified models with estimation of distribution algorithms. IEEE Trans Evol Comput 12:418–438

  • Santana R, Mendiburu A, Zaitlen N, Eskin E, Lozano JA (2010) Multi-marker tagging single nucleotide polymorphism selection using estimation of distribution algorithms. Artif Intell Med 50(3):193–201

  • Schwarz J, Ocenasek J (2000) A problem knowledge-based evolutionary algorithm KBOA for hypergraph bisectioning. In: Proceedings of the 4th joint conference on knowledge-based software engineering. IOS Press, pp 51–58

  • Shakya S, McCall J (2007) Optimization by estimation of distribution with DEUM framework based on Markov random fields. Int J Autom Comput 4(3):262–272

  • Shakya S, Santana R (2012) Markov networks in evolutionary computation. Springer, Berlin

  • Shapiro JL (2005) Drift and scaling in estimation of distribution algorithms. Evol Comput 13(1):99–123

  • Soto M, Gonzalez-Fernandez Y, Ochoa-Zezzatti C (2015) Modeling with copulas and vines in estimation of distribution algorithms. Inves Oper 36:1–23

  • Thurstone L (1927) A law of comparative judgment. Psychol Rev 34:273–286

  • Vitelli V, Sørensen Ø, Crispino M, Frigessi A, Arjas E (2017) Probabilistic preference learning with the Mallows rank model. J Mach Learn Res 18(1):5796–5844

  • Wang C, Ma H, Chen G, Hartmann S (2018) Towards fully automated semantic web service composition based on estimation of distribution algorithm. In: Mitrovic T, Xue B, Li X (eds) AI 2018: advances in artificial intelligence. Springer, Cham, pp 458–471

  • Wright AH, Pulavarty S (2005) Estimation of distribution algorithm based on linkage discovery and factorization. In: 2005 genetic and evolutionary computation conference (GECCO-2005), Washington D.C., USA. ACM, pp 695–703

  • Wu Y, Song W, Cao Z, Zhang J, Lim A (2021) Learning improvement heuristics for solving routing problems. IEEE Trans Neural Netw Learn Syst, pp 1–13

  • Zhang Q, Mühlenbein H (2004) On the convergence of a class of estimation of distribution algorithms. IEEE Trans Evol Comput 8(2):127–136

Author information

Corresponding author

Correspondence to Josu Ceberio.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work has been partially supported by the ELKARTEK program (KK-2020/00049) and Research Groups 2022–2025 (IT1504-22) from the Basque Government, and by the PID2019-106453GA-I00 and PID2019-104933GB-10 research projects from the Spanish Ministry of Science and Innovation.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ceberio, J., Mendiburu, A. & Lozano, J.A. A roadmap for solving optimization problems with estimation of distribution algorithms. Nat Comput 23, 99–113 (2024). https://doi.org/10.1007/s11047-022-09913-2

  • DOI: https://doi.org/10.1007/s11047-022-09913-2
