Abstract
The Mallows (MM) and the Generalized Mallows (GMM) probability models have proven effective within estimation of distribution algorithms (EDAs) for solving permutation-based combinatorial optimisation problems. Recent works, however, have suggested that the performance of these algorithms relies strongly on the distance used in the model. The goal of this paper is to review three common distances for permutations, Kendall’s-\(\tau\), Cayley and Ulam, and to compare their performance under MM and GMM EDAs. Moreover, with the aim of predicting the most suitable distance for solving any given permutation problem, we focus on the relation between these distances and the neighbourhood systems used in local search optimisation. In this sense, we demonstrate that the performance of the MM and GMM EDAs is strongly correlated with that of multistart local search algorithms when using related neighbourhoods. Furthermore, by means of fitness landscape analysis techniques, we show that the suitability of a distance for solving a problem is clearly characterised by how smooth a fitness landscape it generates.
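To make the three distances concrete, the following is a minimal Python sketch based on their standard textbook definitions: Kendall's-τ counts pairwise disagreements, Cayley is n minus the number of cycles of the relative permutation, and Ulam is n minus the length of the longest common subsequence. It is not the authors' implementation; the function names and the helper `_relative` are hypothetical, and permutations are assumed to be Python lists over the same set of items.

```python
from bisect import bisect_left


def _relative(sigma, pi):
    """Hypothetical helper: position in pi of each item of sigma.

    The result is a permutation of 0..n-1 that equals the identity
    exactly when sigma == pi.
    """
    pos = {item: i for i, item in enumerate(pi)}
    return [pos[item] for item in sigma]


def kendall_tau(sigma, pi):
    """Number of item pairs that sigma and pi order differently (O(n^2))."""
    r = _relative(sigma, pi)
    n = len(r)
    return sum(1 for i in range(n) for j in range(i + 1, n) if r[i] > r[j])


def cayley(sigma, pi):
    """Minimum number of swaps (not necessarily adjacent): n minus #cycles."""
    r = _relative(sigma, pi)
    seen = [False] * len(r)
    cycles = 0
    for start in range(len(r)):
        if not seen[start]:
            cycles += 1
            j = start
            while not seen[j]:  # walk the cycle containing `start`
                seen[j] = True
                j = r[j]
    return len(r) - cycles


def ulam(sigma, pi):
    """n minus the length of the longest increasing subsequence of r."""
    r = _relative(sigma, pi)
    tails = []  # tails[k] = smallest tail of an increasing run of length k+1
    for x in r:
        k = bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)
        else:
            tails[k] = x
    return len(r) - len(tails)
```

For example, with `sigma = [2, 0, 1, 3]` and the identity `pi = [0, 1, 2, 3]`, the sketch gives a Kendall's-τ distance of 2, a Cayley distance of 2 and an Ulam distance of 1, which shows how the same pair of permutations can be close under one distance and farther apart under another.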
Notes
By parameter we mean the smallest component of an instance that is used to calculate the fitness of a solution.
Supplementary results, source code, instances, and extended material of the experiments can be downloaded from http://www.sc.ehu.es/ccwbayes/members/jceberio/COAP/Review_Distances.html.
The ARPD results have been calculated using the best known results obtained by the MLSs. Results from the previous section have not been used here.
For the specific expressions, we refer the interested reader to the original paper [37].
References
Baker, K.: Introduction to Sequencing and Scheduling. Wiley, Hoboken (1974)
Box, G.E.P., Jenkins, G.: Time Series Analysis, Forecasting and Control. Holden-Day, Incorporated, San Francisco (1970)
Ceberio, J., Irurozki, E., Mendiburu, A., Lozano, J.A.: A review on estimation of distribution algorithms in permutation-based combinatorial optimization problems. Prog. Artif. Intell. 1(1), 103–117 (2012)
Ceberio, J., Irurozki, E., Mendiburu, A., Lozano, J.A.: A distance-based ranking model estimation of distribution algorithm for the flowshop scheduling problem. IEEE Trans. Evol. Comput. 18(2), 286–300 (2014)
Ceberio, J., Irurozki, E., Mendiburu, A., Lozano, J.A.: Extending distance-based ranking models in estimation of distribution algorithms. In: Proceedings of the 2014 IEEE Congress on Evolutionary Computation. IEEE (2014)
Ceberio, J., Mendiburu, A., Lozano, J.A.: Introducing the Mallows Model on Estimation of Distribution Algorithms. In: Lu B.L., Zhang L., Kwok J.T. (eds.) Proceedings of International Conference on Neural Information Processing (ICONIP). Lecture Notes in Computer Science, pp. 461–470. Springer (2011)
Ceberio, J., Mendiburu, A., Lozano, J.A.: The Plackett-Luce ranking model on permutation-based optimization problems. In: Proceedings of the 2013 IEEE Congress on Evolutionary Computation, pp. 494–501 (2013)
Ceberio, J., Mendiburu, A., Lozano, J.A.: The linear ordering problem revisited. Eur. J. Oper. Res. 241(3), 686–696 (2014)
Diaconis, P.: Group Representations in Probability and Statistics. Institute of Mathematical Statistics Lecture Notes—Monograph Series, 11. Institute of Mathematical Statistics, Hayward (1988)
Fligner, M., Verducci, J.: Multistage ranking models. J. Am. Stat. Assoc. 83(403), 892–901 (1988)
Garcia, S., Herrera, F.: An extension on “statistical comparisons of classifiers over multiple data sets” for all pairwise comparisons. J. Mach. Learn. Res. 9, 2677–2694 (2008)
Garcia, S., Molina, D., Lozano, M., Herrera, F.: A study on the use of non-parametric tests for analyzing the evolutionary algorithms’ behaviour: a case study on the CEC’2005 Special Session on Real Parameter Optimization. J. Heuristics 15(6), 617–644 (2009)
Goldberg, D.E., Lingle Jr., R.: Alleles, Loci and the Traveling Salesman Problem. In: Proceedings of an International Conference on Genetic Algorithms and Their Applications, vol. 154, pp. 154–159. Lawrence Erlbaum, Hillsdale, NJ (1985)
Inza, I., Larrañaga, P., Sierra, B.: Feature subset selection by estimation of distribution algorithms. In: Larrañaga, P., Lozano, J.A. (eds.) Estimation of Distribution Algorithms. A New Tool for Evolutionary Computation, pp. 269–294. Kluwer Academic Publishers, Dordrecht (2002)
Irurozki, E., Calvo, B., Lozano, J.A.: An R package for permutations, Mallows and Generalized Mallows models. Technical Report, University of the Basque Country UPV/EHU (2014)
Irurozki, E., Calvo, B., Lozano, J.A.: Sampling and learning Mallows and Generalized Mallows models under the Cayley distance. Technical Report, University of the Basque Country UPV/EHU (2014)
Irurozki, E., Calvo, B., Lozano, J.A.: Sampling and learning the Mallows model under the Ulam distance. Technical Report, University of the Basque Country UPV/EHU (2014)
Koopmans, T.C., Beckmann, M.J.: Assignment Problems and the Location of Economic Activities. Cowles Foundation Discussion Papers 4, Cowles Foundation for Research in Economics, Yale University (1955)
Larrañaga, P., Lozano, J.A.: Estimation of Distribution Algorithms: A New Tool for Evolutionary Computation. Kluwer Academic Publishers, Dordrecht (2002)
Lozano, J.A., Larrañaga, P., Inza, I., Bengoetxea, E.: Towards a New Evolutionary Computation: Advances on Estimation of Distribution Algorithms (Studies in Fuzziness and Soft Computing). Springer, Secaucus (2006)
Lozano, J.A., Mendiburu, A.: Estimation of distribution algorithms applied to the job scheduling problem. In: Larrañaga, P., Lozano, J.A. (eds.) Estimation of Distribution Algorithms: A New Tool for Evolutionary Computation. Kluwer Academic Publishers, Dordrecht (2002)
Manderick, B., de Weger, M., Spiessens, P.: The genetic algorithm and the structure of the fitness landscape. In: Proceedings of the 4th International Conference on Genetic Algorithms, pp. 143–150 (1991)
Mandhani, B., Meila, M.: Tractable search for learning exponential models of rankings. J. Mach. Learn. Res. 5, 392–399 (2009)
Martí, R.: Multi-start methods. In: Glover, F., Kochenberger, G. (eds.) Handbook of Metaheuristics, International Series in Operations Research and Management Science, vol. 57, pp. 355–368. Springer, Berlin (2003)
Martí, R., Reinelt, G.: The Linear Ordering Problem: Exact and Heuristic Methods in Combinatorial Optimization, vol. 175. Springer, Berlin (2011)
Pelikan, M., Goldberg, D.E., Lobo, F.G.: A survey of optimization by building and using probabilistic models. Comput. Optim. Appl. 21(1), 5–20 (2002)
Reinelt, G.: TSPLIB—A T.S.P. library. Technical Report 250, Universität Augsburg, Institut für Mathematik, Augsburg (1990)
Robles, V., de Miguel, P., Larrañaga, P.: Solving the traveling salesman problem with EDAs. In: Larrañaga, P., Lozano, J.A. (eds.) Estimation of Distribution Algorithms: A New Tool for Evolutionary Computation. Kluwer Academic Publishers, Dordrecht (2002)
Santana, R., Bielza, C., Larrañaga, P., Lozano, J.A., Echegoyen, C., Mendiburu, A., Armañanzas, R., Shakya, S.: Mateda-2.0: estimation of distribution algorithms in MATLAB. J. Stat. Softw. 35(7), 1–30 (2010)
Santana, R., Larrañaga, P., Lozano, J.A.: Combining variable neighborhood search and estimation of distribution algorithms in the protein side chain placement problem. J. Heuristics 14(5), 519–547 (2008)
Santana, R., Larrañaga, P., Lozano, J.A.: Protein folding in simplified models with estimation of distribution algorithms. IEEE Trans. Evol. Comput. 12(4), 418–438 (2008)
Schiavinotto, T., Stützle, T.: The linear ordering problem: instances, search space analysis and algorithms. J. Math. Model. Algorithms 3, 367–402 (2004)
Schiavinotto, T., Stützle, T.: A review of metrics on permutations for search landscape analysis. Comput. Oper. Res. 34(10), 3143–3153 (2007)
Taillard, E.: Benchmarks for basic scheduling problems. Eur. J. Oper. Res. 64(2), 278–285 (1993)
Taillard, E.: Problem instances (1993). http://mistic.heig-vd.ch/taillard/problemes.dir/problemes.html
Tsutsui, S., Wilson, G.: Solving Capacitated Vehicle Routing Problems Using Edge Histogram Based Sampling Algorithms. In: Proceedings of the IEEE Conference on Evolutionary Computation, pp. 1150–1157. Portland, Oregon (2004)
Vassilev, V.K., Fogarty, T.C., Miller, J.F.: Information characteristics and the structure of landscapes. Evol. Comput. 8(1), 31–60 (2000)
Weinberger, E.: Correlated and uncorrelated fitness landscapes and how to tell the difference. Biol. Cybern. 63(5), 325–336 (1990)
Zhang, Q., Sun, J., Tsang, E.: An evolutionary algorithm with guided mutation for the maximum clique problem. IEEE Trans. Evol. Comput. 9, 192–200 (2005)
Zhang, Q., Sun, J., Tsang, E., Ford, J.: Estimation of distribution algorithm with 2-opt local search for the quadratic assignment problem. Stud. Fuzziness Soft Comput. 192, 281–292 (2006)
Acknowledgments
This work has been partially supported by the Saiotek and Research Groups 2013-2018 (IT-609-13) programs (Basque Government), TIN2010-14931 (Ministry of Science and Technology), COMBIOMED network in computational bio-medicine (Carlos III Health Institute) and the NICaiA Project PIRSES-GA-2009-247619 (European Commission). Josu Ceberio holds a grant from the Basque Government.
About this article
Cite this article
Ceberio, J., Irurozki, E., Mendiburu, A. et al. A review of distances for the Mallows and Generalized Mallows estimation of distribution algorithms. Comput Optim Appl 62, 545–564 (2015). https://doi.org/10.1007/s10589-015-9740-x
DOI: https://doi.org/10.1007/s10589-015-9740-x