Original articles
Solving the Pareto front for multiobjective Markov chains using the minimum Euclidean distance gradient-based optimization method

https://doi.org/10.1016/j.matcom.2015.08.004

Highlights

  • Present a novel method based on minimizing the Euclidean distance.

  • Introduce Tikhonov’s regularization method for ensuring strict convexity of the Pareto front.

  • Propose linear constraints over the nonlinear problem employing the c-variable method.

  • Generate an even representation of the entire Pareto surface employing a distance restriction.

  • Present an algorithm for solving multi-objective Markov chain problems.

Abstract

A novel method based on minimizing the Euclidean distance is proposed for generating a well-distributed Pareto set in multi-objective optimization for a class of ergodic controllable Markov chains. The proposed approach is based on the concept of a strong Pareto policy. We consider the case where the search space is a non-strictly convex set. For solving the problem we introduce Tikhonov’s regularization method and implement the Lagrange principle. We reformulate the original problem by introducing linear constraints over the nonlinear problem employing the c-variable method and by constraining the cost functions so that points in the Pareto front have a small distance from one another. As a result, the proposed method generates an even representation of the entire Pareto surface. We then propose an algorithm to compute the Pareto front and provide all the details needed to implement the method in an efficient and numerically stable way. We also prove the main theorems describing the dependence of the saddle point on the regularizing parameter and analyze its asymptotic behavior. Moreover, we analyze the step-size parameter of the Lagrange principle and its asymptotic behavior. The suggested approach is validated theoretically and verified by a numerical example related to security patrolling that presents a technique for visualizing the Pareto front.

Introduction

In the traditional optimal Markov control problem the main goal is to find an optimal policy (strategy) that optimizes a single objective function [24]. The optimal policy is a point where the given objective function attains its minimum, if a solution exists. The multi-objective optimization problem (MOP), on the other hand, involves optimizing several functions at the same time. The optimal policy of an individual function generally differs from the optimal policies of the other objective functions.

The fundamental problem is to construct the Pareto front, composed of an infinite number of so-called non-dominated points. In particular, a policy that minimizes the objective function in the sense of Pareto is said to be a Pareto policy. A utopia point is determined by the infimum of the objective function. A key issue for constructing the Pareto set is to find the strong Pareto policies, determined by the objective functions, that are closest to the utopia policies in the sense of the usual Euclidean norm.

Multi-objective optimization is a very active area of research. For a survey of different types of nonlinear MOPs we refer to [17] and [16] and, in the linear case, to [13] and [29]. A different approach to tackle the problem, advantageous when the MOP is discrete, is to use Evolutionary Algorithms (see [34], [6], [5], [9], [14]) or Particle Swarm Optimization (see [8], [18], [15], [12]). A method based on a stochastic approach is presented in [28], continuation or homotopy methods in [25], [26], and geometrically motivated methods in [4], [27]. Another way to compute the entire Pareto set is to use subdivision techniques (see [7]). Other methods focus on defining algorithms that produce a well-distributed Pareto set [21], [22], [33].

The algorithms for finding the Pareto set presented in the literature are structurally stable, in the sense that they are able to build suitable representations of the Pareto front when the functions are convex (strictly convex). Most of the existing solutions, supported by a local search approach based on classical linear and nonlinear programming, assume that a solution always exists. However, in general there is a serious problem: the search space is, in most cases, a non-strictly convex set. This kind of behavior represents a typical situation (it is not artificially achieved for unrealistic objective functions).

Tikhonov regularization [31], [30] is one of the most popular approaches to solving discrete ill-posed minimization problems $\min_x \|Ax-b\|$. The method seeks to determine a useful approximation of $x$ by replacing the original minimization problem with a penalized least-squares problem of the form $\min_x \|Ax-b\|^2 + \delta\|Lx\|^2$, where the regularization parameter $\delta>0$ is chosen to control the size of the solution vector.
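The penalized least-squares trade-off above can be sketched numerically via its normal equations, $(A^\top A + \delta I)x = A^\top b$ with $L=I$; the Vandermonde matrix and data below are illustrative assumptions, not taken from the paper:

```python
import numpy as np

# Hypothetical ill-conditioned system illustrating Tikhonov regularization:
# minimize ||A x - b||^2 + delta * ||x||^2   (regularization operator L = I).
A = np.vander(np.linspace(0.0, 1.0, 8), 5)      # nearly rank-deficient design matrix
x_true = np.array([1.0, -2.0, 0.5, 3.0, -1.0])  # illustrative "true" solution
b = A @ x_true

delta = 1e-3                                    # regularization parameter
n = A.shape[1]
# Normal equations of the penalized problem: (A^T A + delta I) x = A^T b.
x_reg = np.linalg.solve(A.T @ A + delta * np.eye(n), A.T @ b)

# The penalty term makes the objective strictly convex, so x_reg is unique;
# since b = A x_true, the residual is bounded by sqrt(delta) * ||x_true||.
residual = np.linalg.norm(A @ x_reg - b)
```

Larger values of $\delta$ shrink the solution norm at the cost of a larger residual; $\delta\to0$ recovers the unregularized least-squares problem.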

We consider the problem of minimizing the Euclidean distance to a given affine space, $\min\{\tfrac{1}{2}\|x\|^2 : Ax=b\}$. The main problem is proving the existence and characterization of strong Pareto policies. The existence of Pareto, weak Pareto, and proper Pareto policies is easy to establish because it can be obtained using the traditional scalarization method. We present an original formulation in terms of a coupled nonlinear programming problem implementing the Lagrange principle. For establishing the existence and characterization of strong Pareto policies we employ Tikhonov’s regularization method. Regularization refers to a process of introducing additional information in order to solve an ill-posed problem. Specifically, Tikhonov regularization is a trade-off between fitting the data and reducing a norm of the solution, ensuring the convergence of the objective functions to a local Pareto optimal policy. Each equation in this system is an optimization problem for which the necessary condition of a minimum is solved using the projection gradient method. For continuation purposes we restrict the cost functions, allowing points in the Pareto front to have a small distance from one another. In addition, we employ the c-variable method for the introduction of linear constraints over the nonlinear problem. Moreover, we present the convergence conditions and compute the estimated rate of convergence of the variables corresponding to the Lagrange principle and Tikhonov’s regularization, respectively. We suggest an algorithm for computing the Pareto front and provide all the details needed to implement it in an efficient and numerically stable way. We also prove the main theorems describing the dependence of the saddle point on the regularizing parameter $\delta$ and analyze its asymptotic behavior as $\delta\to0$, and we study the step-size parameter $\gamma$ and its asymptotic behavior as $\gamma\to0$.
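When $A$ has full row rank, this minimum-norm problem has the closed-form solution $x^{*}=A^{\top}(AA^{\top})^{-1}b$, obtained from the Lagrange conditions; the small matrix and right-hand side below are illustrative assumptions:

```python
import numpy as np

# Minimum Euclidean norm point of the affine space {x : A x = b}:
# the Lagrange conditions x = A^T mu, A x = b give x* = A^T (A A^T)^{-1} b.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])   # full row rank (2 x 3)
b = np.array([1.0, 2.0])

x_star = A.T @ np.linalg.solve(A @ A.T, b)

# x_star is feasible and has the smallest norm among all feasible points.
assert np.allclose(A @ x_star, b)
```

Here $x^{*}=(0,1,1)$; any other feasible point, e.g. $(1,0,2)$, has a strictly larger norm.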
We validate the method theoretically and demonstrate its usefulness with a numerical example related to security patrolling, involving a technique for visualizing the Pareto front.

The remainder of the paper is organized as follows. The following section presents the mathematical background on multi-objective optimization for Markov chains needed to understand the rest of the paper. We also present the main lemmas and theorems on ergodicity for Markov chains. Next, in Section 3, we introduce the multi-objective problem formulation, including the Pareto optimality properties and the minimum Euclidean distance problem, solved by introducing Tikhonov’s regularization method and implementing the Lagrange principle. Then, in Section 4, we outline a complete description of an algorithm for computing the Pareto front, which is the main result of this paper, and provide all the details needed to implement the algorithm in an efficient and numerically stable way. Section 5 provides the main theorems describing the dependence of the saddle point on the regularizing parameter, analyzes its asymptotic behavior, and studies the step-size parameter and its asymptotic behavior. We present a numerical example related to security patrolling for validating the proposed method in Section 6. We close, in Section 7, by summarizing our contributions and outlining some directions for future work.

Section snippets

Multi-objective Markov chains

The Markov property of a Markov chain is said to be fulfilled if $P(s(n+1)\mid s(1),s(2),\ldots,s(n-1),s(n)=s(i),a(n)=a(k)) = P(s(n+1)\mid s(n)=s(i),a(n)=a(k))$.

The strategy (policy) $d^{(k|i)}(n) \triangleq P(a(n)=a(k)\mid s(n)=s(i))$ represents the probability measure associated with the occurrence of an action $a(n)=a(k)$ from state $s(n)=s(i)$.

The elements of the transition matrix for the controllable Markov chain can be expressed as $P(s(n+1)=s(j)\mid s(n)=s(i)) = \sum_{k=1}^{M} P(s(n+1)=s(j)\mid s(n)=s(i),a(n)=a(k))\,d^{(k|i)}(n)$.
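This mixing of the action-conditional kernels by a randomized policy can be sketched with a small numerical example; the 2-state, 2-action kernels and the policy $d^{(k|i)}$ below are hypothetical, not taken from the paper:

```python
import numpy as np

# Action-conditional transition kernels P(s(n+1)=j | s(n)=i, a(n)=k),
# stored with shape (states i, actions k, next states j).
P_given_a = np.array([
    [[0.9, 0.1], [0.2, 0.8]],   # from state 0 under actions 0 and 1
    [[0.5, 0.5], [0.7, 0.3]],   # from state 1 under actions 0 and 1
])
# Randomized policy d(k | i): rows are per-state action distributions.
d = np.array([[0.6, 0.4],
              [0.3, 0.7]])

# Induced chain: P(j | i) = sum_k P(j | i, k) * d(k | i).
P = np.einsum('ikj,ik->ij', P_given_a, d)

assert np.allclose(P.sum(axis=1), 1.0)   # rows remain stochastic
```

Choosing a different policy `d` changes the induced chain, which is what makes the chain controllable.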

Let us denote the

Pareto optimality

To study the existence of Pareto policies we shall first follow the well-known “scalarization” approach [19], [20]. Thus, given an $n$-vector $\lambda>0$, we consider the scalar (or real-valued) cost function $J$.

The Pareto set can be defined as [10], [11] $\mathcal{P} \triangleq \{c(\lambda) \in \arg\min_{c\in C_{adm}} [\sum_{l=1}^{N} \lambda_l J_l(c)],\ \lambda\in S^N\}$ such that $S^N \triangleq \{\lambda\in\mathbb{R}^N : \lambda_l\in[0,1],\ \sum_{l=1}^{N}\lambda_l=1\}$. The Pareto front is defined as the image of $\mathcal{P}$ under $J$ as follows: $J(\mathcal{P}) \triangleq \{(J_1(c(\lambda)), J_2(c(\lambda)), \ldots, J_N(c(\lambda))) \mid c\in\mathcal{P}\}$.
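The scalarization definition can be illustrated on a toy bi-objective problem; the two quadratic cost functions and the finite candidate grid standing in for $C_{adm}$ are assumptions made for this sketch, not the paper's Markov-chain setting:

```python
import numpy as np

# Two conflicting convex costs with individual minimizers at c = 1 and c = -1.
def J1(c):
    return (c - 1.0) ** 2

def J2(c):
    return (c + 1.0) ** 2

candidates = np.linspace(-2.0, 2.0, 401)   # discrete stand-in for C_adm

# Sweep lambda over the 1-simplex and keep each weighted-sum minimizer.
pareto_set = []
for lam in np.linspace(0.0, 1.0, 21):
    scalar_cost = lam * J1(candidates) + (1.0 - lam) * J2(candidates)
    pareto_set.append(candidates[np.argmin(scalar_cost)])

# For these quadratics the minimizer is c(lambda) = 2*lambda - 1, so the
# Pareto set fills the interval [-1, 1] between the two individual optima.
assert all(-1.0 <= c <= 1.0 for c in pareto_set)
```

The Pareto front is then the curve $(J_1(c(\lambda)), J_2(c(\lambda)))$ traced as $\lambda$ sweeps the simplex.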

We consider the usual partial order for n-vectors x and y, the

Computing the Pareto front

The aim of this section is to present an algorithm to obtain the optimal Pareto policies $c$ and the optimal scalar values of $\lambda$ in order to compute the Pareto front under the constraints given by (6) and (22).

The method proposed to compute the Pareto front is as follows:

  • 1.

    Fix the parameters and initial conditions: ϵ,λ0,δ0,θ,ρ,γ0,n0

  • 2.

Compute the utopian points $\bar{c} \in \arg\min_{c\ge0} F_\delta(c,\lambda=0,\mu)$ and $\hat{c} \in \arg\min_{c\ge0} F_\delta(c,\lambda=1,\mu)$

  • 3.

Define the numerical sequences as follows (see Theorem 11 and Theorem 12 in Section 5): $\gamma_n=\{$

Rate of convergence

We present the convergence conditions and compute the estimated rate of convergence of the variables $\gamma$ and $\delta$ [24], [23]. We analyze the regularizing parameter $\delta$ and its asymptotic behavior as $\delta\to0$. We also analyze the step-size parameter $\gamma$ and its asymptotic behavior as $\gamma\to0$.

Theorem 11

Within the class of numerical sequences $\gamma_n=\dfrac{\gamma_0}{(n+n_0)^{\gamma}}$, $\gamma_0,n_0,\gamma>0$, and $\delta_n=\dfrac{\delta_0}{(n+n_0)^{\delta}}$, $\delta_0,\delta>0$, the step size $\gamma_n$ and the regularizing parameter $\delta_n$ satisfy the following conditions: $0<\gamma_n\to0$, $0<\delta_n\to0$ when $n\to\infty$, $\sum_{n=0}^{\infty}\gamma_n\delta_n=\infty$, $\dfrac{\gamma_n}{\delta_n}\to\xi$
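The admissible power-law sequences in this theorem can be checked numerically; the exponents below are illustrative choices with $\gamma+\delta\le1$ (an assumption for this sketch, so that $\sum_n\gamma_n\delta_n$ diverges while both sequences decrease to zero):

```python
import numpy as np

# Power-law step-size and regularization sequences
#   gamma_n = gamma0 / (n + n0)**g,   delta_n = delta0 / (n + n0)**d.
gamma0, delta0, n0 = 1.0, 1.0, 1
g, d = 0.4, 0.5            # g + d = 0.9 <= 1 keeps sum(gamma_n * delta_n) divergent
n = np.arange(0, 100_000)

gamma_n = gamma0 / (n + n0) ** g
delta_n = delta0 / (n + n0) ** d

# Both sequences are positive and decrease to zero ...
assert gamma_n[0] > gamma_n[-1] > 0 and delta_n[0] > delta_n[-1] > 0
# ... while the partial sums of gamma_n * delta_n keep growing without bound
# (the terms behave like n**(-0.9), a divergent generalized harmonic series).
partial_sums = np.cumsum(gamma_n * delta_n)
```

Choosing larger exponents would make both sequences vanish faster but break the divergence condition needed for convergence of the iterates.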

Numerical example: security patrolling

Our example focuses on real-world domains where security agencies protect the branches of a bank from a diverse set of attackers [1], [32], [3]. The branch is the only channel of access to a financial institution’s services, providing cash withdrawals, deposits, safe deposit box rentals, insurance, etc. The defender has to consider multiple, and potentially conflicting, objectives when deciding upon a security policy. To solve the problem, we propose a multi-objective approach that has a set

Conclusion and future work

In this paper we have presented a novel method for generating a well-distributed Pareto set in nonlinear multi-objective optimization. The solution of the problem is restricted to a class of finite-time ergodic controllable Markov chains. The proposed method presents itself as a potential addition to the growing suite of Pareto generators, with potential advantages for ill-conditioned problems. For solving the problem we transformed the original multi-objective nonlinear problem into an

Acknowledgments

We would like to thank the anonymous referees for their valuable and helpful comments and suggestions on improving this paper.

References (34)

  • J.B. Clempner et al.

    Simple computing of the customer lifetime value: a fixed local-optimal policy approach

    J. Syst. Sci. Syst. Eng.

    (2014)
  • I. Das et al.

    Normal-boundary intersection: an alternate approach for generating pareto-optimal points in multicriteria optimization problems

    SIAM J. Opt.

    (1998)
  • K. Deb

    Multi-Objective Optimization Using Evolutionary Algorithms

    (2001)
  • K. Deb et al.

A fast elitist non-dominated sorting genetic algorithm for multi-objective optimization: NSGA-II

  • M. Dellnitz et al.

    Covering pareto sets by multilevel subdivision techniques

    J. Optim. Theory Appl.

    (2005)
  • J.E. Fieldsend, S. Singh, A multi-objective algorithm based upon particle swarm optimization, an efficient data...
  • C.M. Fonseca, P.J. Fleming, E. Zitzler, K. Deb, and L. Thiele (Eds.), Second International Conference on Evolutionary...