Original articles
Solving the Pareto front for multiobjective Markov chains using the minimum Euclidean distance gradient-based optimization method
Introduction
In the traditional optimal Markov control problem, the main goal is to find an optimal policy (strategy) that optimizes a single objective function [24]. The optimal policy is a point at which the given objective function attains its minimum, if a solution exists. The multi-objective optimization problem (MOP), on the other hand, concerns optimizing several functions at the same time. In general, the optimal policy of an individual objective function differs from the optimal policies of the other objective functions.
The fundamental problem is to construct the Pareto front, composed of an infinite number of so-called non-dominated points. In particular, a policy that minimizes the objective function in the sense of Pareto is said to be a Pareto policy. A utopia point is determined by the infimum of the objective function. A key issue in constructing the Pareto set is to find the strong Pareto policies, determined by the objective function, that are closest to the utopia policies in the sense of the usual Euclidean norm.
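As a minimal illustration of the utopia point and the Euclidean-distance criterion, the sketch below works on a small finite set of cost vectors rather than on Markov chain policies. The sample cost vectors and the names `utopia_point` and `closest_to_utopia` are assumptions made for this example only; they do not come from the paper.

```python
import math


def utopia_point(costs):
    """Component-wise minimum over all candidate cost vectors."""
    return [min(column) for column in zip(*costs)]


def closest_to_utopia(costs):
    """Index of the cost vector nearest to the utopia point (Euclidean norm)."""
    u = utopia_point(costs)
    return min(range(len(costs)), key=lambda i: math.dist(costs[i], u))


# Hypothetical two-objective costs of four candidate policies.
costs = [(1.0, 5.0), (2.0, 2.0), (5.0, 1.0), (4.0, 4.0)]
utopia = utopia_point(costs)
best_index = closest_to_utopia(costs)
```

In this toy data no single candidate attains the utopia point, which is the usual situation when the objectives conflict; the distance criterion then selects a compromise candidate.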
Multi-objective optimization is an active area of research. For a survey of different types of nonlinear MOPs we refer to [17] and [16] and, in the linear case, to [13] and [29]. A different approach to the problem, advantageous when the MOP is discrete, is to use Evolutionary Algorithms (see [34], [6], [5], [9], [14]) or Particle Swarm Optimization (see [8], [18], [15], [12]). A method based on a stochastic approach is presented in [28], continuation or homotopy methods in [25], [26], and geometrically motivated methods in [4], [27]. Another way to compute the entire Pareto set is to use subdivision techniques (see [7]). Other methods focus on defining algorithms that produce a well-distributed Pareto set [21], [22], [33].
The algorithms for finding the Pareto set presented in the literature are structurally stable, in the sense that they are able to build good representations of the Pareto front when the functions are convex (strictly convex). Most of the existing solutions, supported by a local search approach based on classical linear and nonlinear programming, assume that a solution always exists. In general, however, there is a serious problem: in most cases the search space is a non-strictly convex set. This behavior is typical; it is not an artifact of unrealistic objective functions.
Tikhonov regularization [31], [30] is one of the most popular approaches to solving discrete ill-posed minimization problems. The method seeks a useful approximation of the solution by replacing the minimization problem (1) with a penalized least-squares problem, in which a regularization parameter is chosen to control the size of the solution vector.
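To make the effect of the penalty concrete, the following sketch treats the simplest ill-posed case: a single linear equation in several unknowns, which has infinitely many least-squares solutions. The symbols `a`, `b`, and `delta`, and the function name `tikhonov_rank_one`, are assumptions for illustration; the paper's own problem (1) is not reproduced here.

```python
def tikhonov_rank_one(a, b, delta):
    """Minimizer of (a.x - b)**2 + delta * ||x||**2 for a single-row system.

    Setting the gradient to zero gives x = a * b / (||a||**2 + delta),
    which is unique for every delta > 0.
    """
    scale = b / (sum(ai * ai for ai in a) + delta)
    return [ai * scale for ai in a]


# x1 + x2 = 2 has infinitely many solutions; the penalty singles one out.
x_strong = tikhonov_rank_one([1.0, 1.0], 2.0, 0.5)
x_weak = tikhonov_rank_one([1.0, 1.0], 2.0, 1e-6)
```

A larger `delta` shrinks the solution toward zero, while a small `delta` gives a point close to the minimum-norm solution of the unpenalized problem, here (1, 1).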
We consider the problem of minimizing the Euclidean distance to a given affine space. The main problem is proving the existence and characterization of strong Pareto policies. The existence of Pareto, weak Pareto, and proper Pareto policies is easy to establish because it can be obtained using the traditional scalarization method. We present an original formulation in terms of a coupled nonlinear programming problem implementing the Lagrange principle. To establish the existence and characterization of strong Pareto policies we employ Tikhonov’s regularization method. Regularization refers to a process of introducing additional information in order to solve an ill-posed problem. Specifically, Tikhonov regularization is a trade-off between fitting the data and reducing a norm of the solution, ensuring the convergence of the objective functions to a local Pareto optimal policy. Each equation in this system is an optimization problem for which the necessary condition of a minimum is solved using the projection gradient method. For continuation purposes we restrict the cost-functions, allowing points in the Pareto front to have a small distance from one another. In addition, we employ the -variable method for the introduction of linear constraints over the nonlinear problem. Moreover, we present the convergence conditions and estimate the rate of convergence of the variables corresponding to the Lagrange principle and Tikhonov’s regularization, respectively. We suggest an algorithm for computing the Pareto front and provide all the details needed to implement the algorithm in an efficient and numerically stable way. We also prove the main theorems describing the dependence of the saddle point on the regularizing parameter and analyze its asymptotic behavior, and we study the step-size parameter and its asymptotic behavior.
We validate the method theoretically and demonstrate its usefulness with a numerical example related to security patrolling, which includes a technique for visualizing the Pareto front.
The remainder of the paper is organized as follows. The following section presents the mathematical background on multi-objective optimization for Markov chains needed to understand the rest of the paper, together with the main lemmas and theorems on ergodicity for Markov chains. Section 3 introduces the multi-objective problem formulation, including the Pareto optimality properties and the minimum Euclidean distance problem, which is solved by introducing Tikhonov’s regularization method and implementing the Lagrange principle. Section 4 gives a complete description of an algorithm for computing the Pareto front, which is the main result of this paper, and provides all the details needed to implement the algorithm in an efficient and numerically stable way. Section 5 provides the main theorems describing the dependence of the saddle point on the regularizing parameter, analyzes its asymptotic behavior, and studies the step-size parameter and its asymptotic behavior. Section 6 presents a numerical example related to security patrolling that validates the proposed method. Section 7 closes by summarizing our contributions and outlining some directions for future work.
Section snippets
Multi-objective Markov chains
The Markov property of a Markov chain is said to be fulfilled if
The strategy (policy) represents the probability measure associated with the occurrence of an action from state .
The elements of the transition matrix for the controllable Markov chain can be expressed as
Let us denote the
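The relationship between a randomized policy and the resulting transition matrix can be sketched as follows. The code assumes the standard construction in which each entry of the controlled matrix averages the action-dependent transition probabilities with the policy's action probabilities; the nested-list layout `kernel[i][k][j]` and `policy[i][k]` and all numeric values are hypothetical choices for this example.

```python
def controlled_transition_matrix(kernel, policy):
    """Transition matrix induced by a randomized policy.

    kernel[i][k][j]: probability of moving from state i to state j
                     when action k is applied (assumed layout).
    policy[i][k]:    probability of choosing action k in state i.
    """
    n_states = len(kernel)
    return [
        [
            sum(policy[i][k] * kernel[i][k][j] for k in range(len(policy[i])))
            for j in range(n_states)
        ]
        for i in range(n_states)
    ]


# Hypothetical chain with two states and two actions per state.
kernel = [
    [[0.9, 0.1], [0.2, 0.8]],  # from state 0 under actions 0 and 1
    [[0.5, 0.5], [0.0, 1.0]],  # from state 1 under actions 0 and 1
]
policy = [[0.5, 0.5], [1.0, 0.0]]
P = controlled_transition_matrix(kernel, policy)
```

Because each row of `policy` and each `kernel[i][k]` sums to one, every row of `P` is again a probability distribution.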
Pareto optimality
To study the existence of Pareto policies we shall first follow the well-known “scalarization” approach [19], [20]. Thus, given a -vector we consider the scalar (or real-valued) cost-function .
The Pareto set can be defined as [10], [11] such that The Pareto front is defined as the image of under as follows
We consider the usual partial order for -vectors and , the
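The partial order and the scalarization idea can be illustrated on a finite set of cost vectors. This is only a sketch under simplifying assumptions: the points, the weight vector, and the function names `dominates`, `pareto_front`, and `scalarized_minimizer` are invented for the example, and minimization of every component is assumed.

```python
def dominates(u, v):
    """True if u is no worse than v everywhere and strictly better somewhere."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def scalarized_minimizer(points, weights):
    """Point minimizing the weighted sum of its components."""
    return min(points, key=lambda p: sum(w * c for w, c in zip(weights, p)))


points = [(1.0, 5.0), (2.0, 2.0), (5.0, 1.0), (4.0, 4.0), (3.0, 3.0)]
front = pareto_front(points)
chosen = scalarized_minimizer(points, (0.5, 0.5))
```

With strictly positive weights, the scalarized minimizer is always a non-dominated point, which is why scalarization is a convenient first tool for establishing that Pareto policies exist.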
Computing the Pareto front
The aim of this section is to present an algorithm to obtain the optimal Pareto policies and the optimal scalar values of in order to compute the Pareto front under the constraints given by (6) and (22).
The method proposed to compute the Pareto front is as follows:
1. Fix the parameters and initial conditions:
2. Compute the utopian points and
3. Define the numerical sequences as follows (see Theorem 11 and Theorem 12 in Section 5):
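The interplay of a decaying step size, a decaying regularizing parameter, and the projection step can be sketched on a toy problem. Everything specific below is an assumption rather than the paper's algorithm: the objective is a linear cost over the probability simplex (a non-strictly convex problem with many minimizers), and the sequences `gamma_n` and `delta_n` are illustrative power-law choices, not the ones given by Theorem 11 and Theorem 12.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, uj in enumerate(u, start=1):
        cumulative += uj
        t = (cumulative - 1.0) / j
        if uj - t > 0:
            theta = t  # threshold from the last index that passes the test
    return [max(x - theta, 0.0) for x in v]


def regularized_projected_gradient(c, x0, steps=2000, gamma0=0.5, delta0=1.0):
    """Minimize c.x over the simplex with a vanishing Tikhonov term.

    Assumed sequences (illustrative only):
      gamma_n = gamma0 / (n + 1) ** 0.5   (step size)
      delta_n = delta0 / (n + 1) ** 0.25  (regularizing parameter)
    """
    x = list(x0)
    for n in range(steps):
        gamma = gamma0 / (n + 1) ** 0.5
        delta = delta0 / (n + 1) ** 0.25
        # Gradient of c.x + (delta / 2) * ||x||**2.
        grad = [ci + delta * xi for ci, xi in zip(c, x)]
        x = project_to_simplex([xi - gamma * gi for xi, gi in zip(x, grad)])
    return x


# Costs 1, 1, 2: every x with x[2] == 0 minimizes c.x on the simplex.
x = regularized_projected_gradient([1.0, 1.0, 2.0], [0.2, 0.3, 0.5])
```

Without the regularizing term, the iteration would settle on whichever minimizer it reaches first; with it, the iterate in this toy problem approaches the minimum-norm minimizer, in which the two cheapest components share the mass equally.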
Rate of convergence
We now present the convergence conditions and estimate the rate of convergence of the variables and [24], [23]. We analyze the regularizing parameter and its asymptotic behavior. We also analyze the step-size parameter and its asymptotic behavior.
Theorem 11. Within the class of numerical sequences the step size and the regularizing parameter satisfy the following conditions:
Numerical example: security patrolling
Our example focuses on real-world domains where security agencies protect the branches of a bank from a diverse set of attackers [1], [32], [3]. The branch is the only channel of access to a financial institution’s services, which include cash withdrawals, deposits, safe deposit box rentals, insurance, etc. The defender has to consider multiple, and potentially conflicting, objectives when deciding upon a security policy. To solve the problem, we propose a multi-objective approach that has a set
Conclusion and future work
This paper presented a novel method for generating a well-distributed Pareto set in nonlinear multi-objective optimization. The solution of the problem is restricted to a class of finite-time ergodic controllable Markov chains. The proposed method is a potential addition to the growing suite of Pareto generators, with potential advantages for ill-conditioned problems. For solving the problem we transformed the original multi-objective nonlinear problem into an
Acknowledgments
We would like to thank the anonymous referees for their valuable and helpful comments and suggestions on improving this paper.
References (34)
- et al. Stackelberg security games: computing the shortest-path equilibrium. Expert Syst. Appl. (2015)
- et al. An augmented multi-objective particle swarm optimizer for building cluster operation decisions. Appl. Soft Comput. (2014)
- An evolutionary algorithm for multi-criteria inverse optimal value problems using a bilevel optimization model. Appl. Soft Comput. (2014)
- et al. A novel combination of particle swarm optimization and genetic algorithm for Pareto optimal design of a five-degree of freedom vehicle vibration model. Appl. Soft Comput. (2013)
- Fast computation of equispaced Pareto manifolds and Pareto fronts for multi-objective optimization problems. Math. Comput. Simul. (2009)
- et al. Equispaced Pareto front construction for constrained bi-objective optimization. Math. Comput. Modelling (2013)
- et al. Characterization of graph properties for improved Pareto fronts using heuristics and EA for bi-objective graph coloring problem. Appl. Soft Comput. (2013)
- et al. A Stackelberg security game with random strategies based on the extraproximal theoretic approach. Eng. Appl. Artif. Intell. (2015)
- et al. A method for generating a well-distributed Pareto set in nonlinear multi-objective optimization. J. Comput. Appl. Math. (2009)
- et al. An extended study on multi-objective security games. Auton. Agents and Multi-Agent Syst. (2014)
- Simple computing of the customer lifetime value: a fixed local-optimal policy approach. J. Syst. Sci. Syst. Eng.
- Normal-boundary intersection: an alternate approach for generating Pareto-optimal points in multicriteria optimization problems. SIAM J. Opt.
- Multi-Objective Optimization Using Evolutionary Algorithms
- A fast elitist non-dominated sorting genetic algorithm for multi-objective optimization: NSGA-II
- Covering Pareto sets by multilevel subdivision techniques. J. Optim. Theory Appl.