Abstract
This paper proposes a new method for generating the Pareto front in multi-objective Markov chains that overcomes some drawbacks of existing multi-objective methods. A fundamental issue is finding strong Pareto policies, that is, policies whose cost-function value is the closest in Euclidean norm to the utopian point. A strong Pareto policy is reached when each cost function, constrained by the strategies of the others, cannot further improve its own criterion. The constraints associated with the objective function are implemented by formulating the problem as a bi-level optimization problem. We convert it into a single-level optimization problem by introducing a generalized Lagrangian function, which represents the original multi-objective problem in terms of a related nonlinear programming problem. We then apply the Tikhonov regularization method to the objective function; the regularization ensures that all Pareto policies generated along the Pareto front are strong Pareto policies. To solve the problem we employ the extraproximal method, which effectively approximates every optimal Pareto point in the Pareto front, each of which is in this case a strong Pareto point. An experimental result, applied to a route-selection problem for counter-kidnapping, validates the effectiveness and usefulness of the method.
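To make the selection criterion concrete, the following minimal sketch picks, from a finite set of candidate cost vectors, the one closest in Euclidean norm to the utopian point. It is an illustration only, not the paper's algorithm: it assumes every objective is minimized, takes the utopian point as the component-wise minimum over the candidates, and uses made-up cost values. The names `utopian_point`, `closest_to_utopia`, and `candidate_costs` are hypothetical.

```python
import math


def utopian_point(costs):
    """Component-wise minimum of each objective over all candidates.

    Assumption: every objective is a cost to be minimized.
    """
    return tuple(min(c[i] for c in costs) for i in range(len(costs[0])))


def closest_to_utopia(costs):
    """Return (index, distance) of the candidate nearest the utopian point."""
    u = utopian_point(costs)
    distances = [math.dist(c, u) for c in costs]
    best = min(range(len(costs)), key=distances.__getitem__)
    return best, distances[best]


# Hypothetical cost vectors (objective 1, objective 2) for four candidate policies.
candidate_costs = [(1.0, 9.0), (3.0, 4.0), (6.0, 2.0), (9.0, 1.0)]

u = utopian_point(candidate_costs)
best_index, best_distance = closest_to_utopia(candidate_costs)
print(u, best_index, best_distance)
```

In this toy data the utopian point is not attained by any single candidate, which is the typical situation: the selected policy is a compromise that balances the objectives rather than optimizing one of them.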
Cite this article
Clempner, J.B. Computing multiobjective Markov chains handled by the extraproximal method. Ann Oper Res 271, 469–486 (2018). https://doi.org/10.1007/s10479-018-2755-9
DOI: https://doi.org/10.1007/s10479-018-2755-9